Recent activity
Art replied on March 15, 2009 12:10 to the question "Firefox Extension Question" in LUCI:
Art replied on March 15, 2009 11:54 to the question "Making the Binary File" in LUCI:
Art replied on March 15, 2009 01:24 to the question "Java out of memory error" in LUCI:
Art replied on March 14, 2009 23:12 to the question "Java out of memory error" in LUCI:
Art replied on March 14, 2009 10:27 to the question "Java out of memory error" in LUCI:
I am not sure why you still hit this problem after setting Java up to use 3.25 GB of RAM, but I don't think the conversion process should need that much memory. You can write out the result right after you process each term, so you don't need any data structure to hold the processed data.
Yasser's example in the discussion 7 slides really helps a lot:
http://www.ics.uci.edu/~djp3/classes/...
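To illustrate the "write each term out as soon as it is processed" idea, here is a minimal sketch. The input layout (`term<TAB>docid,docid,...`), the binary record layout, and the class name `StreamingConverter` are assumptions made for the example, not the assignment's actual format.

```java
import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.StringReader;

class StreamingConverter {
    // Reads one "term<TAB>docid,docid,..." line at a time (hypothetical layout)
    // and writes its binary record immediately, so only the current term
    // is ever held in memory.
    static int convert(BufferedReader in, DataOutputStream out) throws IOException {
        int terms = 0;
        String line;
        while ((line = in.readLine()) != null) {
            String[] parts = line.split("\t", 2);
            if (parts.length < 2 || parts[1].isEmpty()) {
                continue; // skip blank or malformed lines
            }
            String[] ids = parts[1].split(",");
            out.writeUTF(parts[0]);      // the term
            out.writeInt(ids.length);    // number of docids that follow
            for (String id : ids) {
                out.writeInt(Integer.parseInt(id.trim()));
            }
            terms++;
        }
        out.flush();
        return terms;
    }

    public static void main(String[] args) throws IOException {
        String sample = "apple\t1,5,9\nbanana\t2\n";
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int n = convert(new BufferedReader(new StringReader(sample)),
                        new DataOutputStream(bytes));
        System.out.println(n + " terms, " + bytes.size() + " bytes");
    }
}
```

In a real run the reader and writer would wrap files instead of in-memory buffers. Memory use then depends on the longest single line rather than on the size of the whole index.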
Art asked a question in LUCI on March 14, 2009 10:13:
Firefox Extension Question
For the extra credit assignment, do we need to create a Firefox extension that processes the query and returns the result to the extension window, or can we just pass the search query along and redirect users to a website that processes the search and displays the result?
Art replied on March 14, 2009 00:05 to the question "Making the Binary File" in LUCI:
Art replied on March 03, 2009 22:48 to the question "Jobs restarted?" in LUCI:
I really think the collection makes a huge difference. It worked for me when I switched to a more efficient collection in the reduce class to store the docids. Before I switched, the reduce tasks took hours after the maps completed and ended up failing for unknown reasons. After switching, the reduce tasks finished within 10 minutes of the map tasks in my luckiest run (for 500,000 URLs). Of course, some other factors also changed between the two jobs, but I believe the collection plays an important role when you need to store and sort around 500,000 or more records for some of the terms in the reduce phase.
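The post does not say which collection was used, so the following is only one possibility: a growable primitive `int[]` buffer that avoids boxing every docid into an `Integer` and sorts once at the end. The class name `DocIdBuffer` and its methods are hypothetical.

```java
import java.util.Arrays;

class DocIdBuffer {
    private int[] ids = new int[16];
    private int size = 0;

    // Appends a docid, doubling the backing array when it is full.
    void add(int docId) {
        if (size == ids.length) {
            ids = Arrays.copyOf(ids, size * 2);
        }
        ids[size++] = docId;
    }

    // Returns the docids sorted ascending with duplicates removed.
    int[] sortedUnique() {
        int[] out = Arrays.copyOf(ids, size);
        Arrays.sort(out);
        int n = 0;
        for (int i = 0; i < out.length; i++) {
            if (i == 0 || out[i] != out[i - 1]) {
                out[n++] = out[i];
            }
        }
        return Arrays.copyOf(out, n);
    }

    public static void main(String[] args) {
        DocIdBuffer buffer = new DocIdBuffer();
        for (int id : new int[] {5, 1, 5, 3}) {
            buffer.add(id);
        }
        System.out.println(Arrays.toString(buffer.sortedUnique()));
    }
}
```

Inside a reducer, `add` would be called once per incoming value and `sortedUnique` once before emitting the posting list. Whether this matches the change described above is an assumption.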
Art replied on March 03, 2009 01:47 to the question "URLs/Doc ids" in LUCI:
Only the docid, according to this post:
http://getsatisfaction.com/luci/topic...
Art replied on March 02, 2009 21:48 to the question "Why does it get to 100% MAP and 100% REDUCE and then fail?" in LUCI:
Art replied on March 02, 2009 21:43 to the question "Why does it get to 100% MAP and 100% REDUCE and then fail?" in LUCI:
Art replied on March 02, 2009 07:18 to the question "Few problems" in LUCI:
Art replied on March 02, 2009 07:16 to the question "Question for URLs" in LUCI:
If the requirement is to find the URLs that contain the first name, then I have a huge problem right now, since I didn't store any record of the URLs that contain my first name. I thought the requirement was to list the URLs whose contents contain the first name :(. Actually, I have an unusual first name, so if I search for URLs that contain it, I can guarantee there are none :(
To get the result, I wrote another program that analyzes the output of the Hadoop process (in term, cf, df, docids format) and generates report.txt and postinglist.txt as the assignment requires. I am not sure whether this is correct, but it might at least give you some ideas.
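A rough sketch of that kind of post-processing step, assuming each output line is tab-separated as `term`, `cf`, `df`, `docids`. The separator, the class name `PostingLookup`, and the report wording are all assumptions; the real output layout may differ.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class PostingLookup {
    // Scans lines of "term<TAB>cf<TAB>df<TAB>docids" (hypothetical layout)
    // and returns a one-line summary for the requested term, if present.
    static Optional<String> postingLine(Iterable<String> lines, String term) {
        for (String line : lines) {
            String[] f = line.split("\t");
            if (f.length == 4 && f[0].equalsIgnoreCase(term)) {
                return Optional.of(f[0] + " cf=" + f[1] + " df=" + f[2] + " docids=" + f[3]);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> output = Arrays.asList(
            "apple\t7\t3\t1,5,9",
            "banana\t2\t1\t2");
        System.out.println(postingLine(output, "banana").orElse("term not found"));
    }
}
```

In practice the lines would come from the job's output files, and the summary would be written to the report file instead of printed.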
Art
Art asked a question in LUCI on March 02, 2009 06:53:
Question for URLs
For assignment06, we need to submit the URLs of the pages that contain one member's first name. Do we need to post the URL or just the docid? The URLs are stored on the Hadoop server, so to get them I would need to either dump all of input500000 to my local machine and process it there, or rerun the job to get the URLs for the matched docids.
Thanks,
Art
Art replied on March 01, 2009 18:24 to the question "Server down?" in LUCI:
Same here. I began the assignment at the beginning of the week, but I still cannot finish the job with 500000 URLs. I added counters and status updates in every place that might take some time to run, but my jobs still got killed when they were very close to the end. Anyway, my 30000 input finished processing, so at least I have something to turn in tomorrow.
Art replied on March 01, 2009 18:11 to the question "Server down?" in LUCI:
Now the error changed:
09/03/01 10:08:41 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=hongfuh, access=WRITE, inode="":mdempsey:supergroup:rwxr-xr-x
Art replied on March 01, 2009 18:01 to the question "Server down?" in LUCI:
Art asked a question in LUCI on March 01, 2009 17:39:
Server down?
Since 3 or 4 am today, the server keeps giving a connection refused error, so none of the jobs can be executed normally. Here is one example of the errors:
09/03/01 09:30:45 INFO ipc.Client: Retrying connect to server: palantir.ics.uci.edu/128.195.58.105:1750. Already tried 0 time(s).
09/03/01 09:30:46 INFO ipc.Client: Retrying connect to server: palantir.ics.uci.edu/128.195.58.105:1750. Already tried 1 time(s).
09/03/01 09:30:47 INFO ipc.Client: Retrying connect to server: palantir.ics.uci.edu/128.195.58.105:1750. Already tried 2 time(s).
09/03/01 09:30:48 INFO ipc.Client: Retrying connect to server: palantir.ics.uci.edu/128.195.58.105:1750. Already tried 3 time(s).
09/03/01 09:30:49 INFO ipc.Client: Retrying connect to server: palantir.ics.uci.edu/128.195.58.105:1750. Already tried 4 time(s).
09/03/01 09:30:50 INFO ipc.Client: Retrying connect to server: palantir.ics.uci.edu/128.195.58.105:1750. Already tried 5 time(s).
09/03/01 09:30:51 INFO ipc.Client: Retrying connect to server: palantir.ics.uci.edu/128.195.58.105:1750. Already tried 6 time(s).
09/03/01 09:30:52 INFO ipc.Client: Retrying connect to server: palantir.ics.uci.edu/128.195.58.105:1750. Already tried 7 time(s).
09/03/01 09:30:53 INFO ipc.Client: Retrying connect to server: palantir.ics.uci.edu/128.195.58.105:1750. Already tried 8 time(s).
09/03/01 09:30:54 INFO ipc.Client: Retrying connect to server: palantir.ics.uci.edu/128.195.58.105:1750. Already tried 9 time(s).
java.io.IOException: Call to palantir.ics.uci.edu/128.195.58.105:1750 failed on local exception: Connection refused
Art replied on January 17, 2009 05:37 to the idea "Palindromes" in LUCI:
