Recent activity
Tomas Ruiz marked one of Yasser's replies in LUCI as useful. Yasser replied to the question "Question for URLs".
Tomas Ruiz asked a question in LUCI on March 03, 2009 01:39:
URLs/Doc ids
Do we have to report the URLs in which our name appears, or just their doc ids?
Tomas Ruiz replied on March 03, 2009 01:35 to the question "Cannot view job details from job tracker" in LUCI:
Tomas Ruiz replied on March 02, 2009 17:27 to the question "Server down?" in LUCI:
Tomas Ruiz replied on March 01, 2009 18:39 to the question "Server down?" in LUCI:
Tomas Ruiz replied on March 01, 2009 18:17 to the question "Server down?" in LUCI:
Tomas Ruiz replied on February 23, 2009 00:17 to the question "Login Exception" in LUCI:
Tomas Ruiz asked a question in LUCI on February 22, 2009 18:43:
Login Exception
I just started Assignment 6, but I failed right at the beginning. I was trying to run Hadoop in "local-standalone" mode, but I got this exception:
javax.security.auth.login.LoginException: Login failed: Cannot run program "whoami": error=2, No such file or directory
at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:257)
at org.apache.hadoop.security.UserGroupInformation.login(UserGroupInformation.java:67)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1438)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1376)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
...
Does anyone know what's happening here?
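The message in the trace ("Cannot run program "whoami": error=2, No such file or directory") says the JVM could not find an executable named whoami. One common cause is that no directory on PATH contains it. The following is a minimal diagnostic sketch, not part of Hadoop; the class and method names (PathCheck, findOnPath) are made up for illustration, and it only checks the exact file name, so on Windows it would not see whoami.exe.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists where a program can be found on a PATH-style string.
class PathCheck {
    /** Returns the absolute path of every executable file named `program` on `pathValue`. */
    static List<String> findOnPath(String program, String pathValue) {
        List<String> hits = new ArrayList<>();
        if (pathValue == null || pathValue.isEmpty()) {
            return hits; // no PATH at all: nothing can be found
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, program);
            // Exact-name match only; Windows executables carry an extension such as .exe.
            if (candidate.isFile() && candidate.canExecute()) {
                hits.add(candidate.getAbsolutePath());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> hits = findOnPath("whoami", System.getenv("PATH"));
        if (hits.isEmpty()) {
            System.out.println("whoami was not found on PATH");
        } else {
            System.out.println("whoami found at: " + hits);
        }
    }
}
```

If this reports that whoami is missing, the login step in the trace would be expected to fail in the same way until a directory containing it is added to PATH.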
Tomas Ruiz replied on February 21, 2009 01:23 to the question "2 Problems" in LUCI:
The two [ ] appear because Hadoop calls the Combiner before calling Reduce, and we are setting the Combiner to do the same thing as Reduce:
conf.setCombinerClass(Reduce.class);
If you want just one [ ], you could check whether the elements returned by your Iterator (in the Reduce class) already have the [ ], remove them, and add the [ ] once at the end of the method (I don't know if I'm making sense). Another solution may be writing different classes for the Combiner and the Reducer, which would do the same job, with the [ ] added only at the end.
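To make the double-bracket effect concrete, here is a small sketch in plain Java that imitates the two passes without using Hadoop at all. The names (CombinerBrackets, naiveReduce, bracketAwareReduce) and the comma-joined output format are assumptions for illustration, not the assignment's actual Reduce class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java imitation of running the same function as Combiner and as Reducer.
class CombinerBrackets {
    /** Joins the values and always wraps the result in [ ]. */
    static String naiveReduce(List<String> values) {
        return "[" + String.join(", ", values) + "]";
    }

    /** Removes one surrounding pair of [ ] if the value has it. */
    static String strip(String value) {
        if (value.length() >= 2 && value.startsWith("[") && value.endsWith("]")) {
            return value.substring(1, value.length() - 1);
        }
        return value;
    }

    /** Strips brackets an earlier pass may have added, then wraps exactly once. */
    static String bracketAwareReduce(List<String> values) {
        List<String> cleaned = new ArrayList<>();
        for (String value : values) {
            cleaned.add(strip(value));
        }
        return "[" + String.join(", ", cleaned) + "]";
    }

    public static void main(String[] args) {
        List<String> mapOutput = Arrays.asList("a", "b");

        // Same function used for both passes: brackets are added twice.
        String combined = naiveReduce(mapOutput);
        System.out.println(naiveReduce(Arrays.asList(combined)));        // [[a, b]]

        // Bracket-aware version: safe whether or not a combiner pass ran first.
        System.out.println(bracketAwareReduce(Arrays.asList(combined))); // [a, b]
        System.out.println(bracketAwareReduce(mapOutput));               // [a, b]
    }
}
```

The bracket-aware variant matters because Hadoop does not promise how many times the Combiner runs (it may be zero, one, or several), so the Reduce step should give the same result for bracketed and unbracketed input.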
Tomas Ruiz replied on February 14, 2009 00:26 to the question "Hadoop distributed file system randomly works." in LUCI:
Tomas Ruiz replied on February 13, 2009 21:35 to the question "Setting classpath in Hadoop" in LUCI:
Tomas Ruiz replied on February 13, 2009 21:33 to the question "Cleaning up the text and words" in LUCI:
Tomas Ruiz replied on February 13, 2009 06:30 to the question "Setting classpath in Hadoop" in LUCI:
I already tried that: I put all the crawler4j jar files in the lib folder, but I'm still getting the ClassNotFoundException, specifically for this class: it.unimi.dsi.parser.callback.Callback
It happens when I'm creating the HTML parser.
I have another issue: I created the ssh key with no password, but it asks me to enter a password when I run the "start-all" script. Is that OK?
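For the ClassNotFoundException mentioned above, a quick way to narrow things down is to ask the JVM directly whether it can see the class, and to print the classpath it is actually using. This is only a diagnostic sketch under the assumption that it is run with the same classpath as the failing program; ClassProbe and isLoadable are hypothetical names.

```java
// Hypothetical diagnostic: checks whether a class is visible to the current class loader.
class ClassProbe {
    /** Returns true if the named class can be located, without running its static initializers. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError, e.g. when a dependency of the class is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "it.unimi.dsi.parser.callback.Callback";
        System.out.println(name + " loadable: " + isLoadable(name));
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

If the class is not loadable, the jar that contains that package is missing from the classpath printed on the second line. Judging by the package name, that jar may be a dependency of crawler4j rather than crawler4j.jar itself, though that is a guess.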
Tomas Ruiz asked a question in LUCI on February 13, 2009 01:26:
Setting classpath in Hadoop
How can I set up the classpath when running Hadoop? I need to specify where crawler4j.jar is, since I get a ClassNotFoundException when I try to run my program, and I have no idea how to do this.
Tomas Ruiz asked a question in LUCI on January 23, 2009 03:28:
Questions about assignment 3
Hi, I have some questions regarding the third assignment. Maybe they will be covered next Monday during discussion, but I'd like to spend the weekend doing this assignment. My questions are the following:
- We should crawl the pages which match the regular expression, but should we also check whether they are inside Wikipedia? I mean, whether the URL starts with http://en.wikipedia.org.
- Is there any method in the libraries to get the size of the downloaded content from a site? And of the text? My idea is just to count one byte for each character in the text (as well as in the HTML).
- How can we restore the process if the crawling fails?
- Do we have to save all the doc ids? I know we just need the one for the article about Obama, but you may want more for the fourth assignment.
- Do we have to change something in the way we measure the length of palindromes, lipograms and rhopalics?
- You said we are receiving a grade according to the longest known sequence. Found by whom? By you or by our peers? If it's your sequence, could you give us those lengths so that we know when we may stop crawling?
So far, I think those are all my questions. Thanks in advance.
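Two of the points above (the Wikipedia prefix check and the "one byte per character" idea) can be sketched in a few lines. This is an illustration only, not the crawler library's API: ContentSize and its methods are hypothetical names, and UTF-8 is an assumed encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helpers for the URL check and the size estimate discussed above.
class ContentSize {
    /** Number of characters (UTF-16 code units) in the text. */
    static int charCount(String text) {
        return text.length();
    }

    /** Number of bytes the text occupies when encoded as UTF-8 (assumed encoding). */
    static int utf8ByteCount(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    /** True if the URL is on English Wikipedia; the trailing slash avoids matching look-alike hosts. */
    static boolean isEnglishWikipedia(String url) {
        return url.startsWith("http://en.wikipedia.org/");
    }

    public static void main(String[] args) {
        String plain = "abc";
        String accented = "caf\u00e9"; // "cafe" with an accented e
        System.out.println(charCount(plain) + " chars, " + utf8ByteCount(plain) + " bytes");       // 3 chars, 3 bytes
        System.out.println(charCount(accented) + " chars, " + utf8ByteCount(accented) + " bytes"); // 4 chars, 5 bytes
        System.out.println(isEnglishWikipedia("http://en.wikipedia.org/wiki/Barack_Obama"));       // true
    }
}
```

Counting one byte per character is exact only for ASCII text; for pages with accented or non-Latin characters the byte count is larger, so the two measures should not be mixed.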
Tomas Ruiz replied on January 18, 2009 06:59 to the question "Unable to access /extra/ugrad_space..." in LUCI:
It seems there are too many things to do on Windows to access openlab. I'm using Ubuntu Linux and it is much easier. You just open a file browser, type "ssh://username@machine", and it will ask you for your password. Then you have access to that file system just as you do to other devices on your computer. The same can be done from a terminal (better for running the Java program, for instance). I have some CDs to install Ubuntu or run it from the CD (versions 7.10 and 8.10). If anyone wants one, just ask me. You can also get your own copy for free at http://shipit.ubuntu.com
Tomas Ruiz replied on January 12, 2009 16:55 to the question "Questions about assignment 2" in LUCI:
Tomas Ruiz replied on January 10, 2009 05:52 to the question "Questions about assignment 2" in LUCI:
Then, I have some doubts:
- If anything other than [A-Za-z] is considered punctuation, it represents more than 10% of the string, so I can't reproduce the results provided.
- If I only count the letters [A-Za-z] when measuring the length of the strings, my program finds a longer lipogram than the one in the example you provided.
If I ignore blank spaces and count all the other characters when measuring the length of the string, my program gets exactly the same output as the one you provided. What should I do?
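The three ways of measuring a string that are being compared here can be written side by side, which makes it easier to see why they disagree. This sketch is not the assignment's reference code; LengthMeasures and its method names are made up, and the sample string is arbitrary.

```java
// Hypothetical comparison of the length conventions discussed above.
class LengthMeasures {
    /** Counts only the letters A-Z and a-z. */
    static int lettersOnly(String s) {
        int count = 0;
        for (char c : s.toCharArray()) {
            if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) {
                count++;
            }
        }
        return count;
    }

    /** Counts every character except whitespace. */
    static int nonBlank(String s) {
        int count = 0;
        for (char c : s.toCharArray()) {
            if (!Character.isWhitespace(c)) {
                count++;
            }
        }
        return count;
    }

    /** Fraction of the string that is not a letter in [A-Za-z]; 0.0 for an empty string. */
    static double nonLetterRatio(String s) {
        if (s.isEmpty()) {
            return 0.0;
        }
        return (s.length() - lettersOnly(s)) / (double) s.length();
    }

    public static void main(String[] args) {
        String sample = "A man, a plan";
        System.out.println("all characters: " + sample.length());      // 13
        System.out.println("non-blank:      " + nonBlank(sample));      // 10
        System.out.println("letters only:   " + lettersOnly(sample));   // 9
        System.out.println("non-letter ratio above 10%: " + (nonLetterRatio(sample) > 0.10)); // true
    }
}
```

Even in this short sample the blanks alone push the non-letter share well past 10%, which matches the first observation above: whether blanks count as punctuation changes the outcome.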
tomas asked a question in LUCI on January 09, 2009 20:32:
Questions about assignment 2
I have some questions about the second assignment:
- Do you consider blank spaces as punctuation marks?
- How do we measure the length of a string? I mean, should we consider blanks, punctuation marks... or only letters from 'A' to 'Z'?
- Should all the rhopalics start with a word of length 1?
tomas started following the question "How do I get into the openlab machines?" in LUCI.
