
Few problems

I finally completed my first map/reduce of 10... phew. I have a few problems that I need to correct before moving on:

I get this error at the beginning of MAP:

09/03/01 22:31:43 INFO mapred.JobClient: Task Id : attempt_200903011033_0082_m_000000_0, Status : FAILED
java.lang.NoClassDefFoundError: edu/uci/ics/crawler4j/crawler/HTMLParser
at ir.assignment05.Downloader.<init>(Downloader.java:11)
at ir.assignment05.PostingList$Map.map(PostingList.java:80)
at ir.assignment05.PostingList$Map.map(PostingList.java:1)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child.main(Child.java:155)
Caused by: java.lang.ClassNotFoundException: edu.uci.ics.crawler4j.crawler.HTMLParser

But the funny thing is that the map finishes and so does the reduce. Everything looks fine, but that exception is scary.

Second, I output some of my statistics (URLs containing Member 1's name, etc) to a file like this:

String pathString = "./" + args[1];
String saveFile = pathString + "/report.txt";
PrintStream writer;
writer = new PrintStream(saveFile);

This works in pseudo-distributed mode. But when I run it on the cluster, nothing is written to the output directory. How do I fix this?
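
For anyone trying to reproduce the second problem, here is a minimal standalone sketch of what that snippet does, using only java.io. The class name ReportPathDemo and its helper methods are made up for illustration and are not part of my assignment code. As far as I understand it, PrintStream opened on a plain path string writes to the local filesystem of whichever machine runs the code, and a relative path like "./" + args[1] is resolved against that JVM's working directory, so this at least shows where the file would end up locally.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintStream;

// Hypothetical demo class: mirrors the report-writing snippet with plain java.io.
public class ReportPathDemo {

    // Builds the same relative path as the snippet: "./<outputDir>/report.txt".
    static String reportPath(String outputDir) {
        return "./" + outputDir + "/report.txt";
    }

    // Shows where java.io resolves that relative path on this machine
    // (relative to the JVM's working directory, on the local filesystem).
    static String resolvedLocation(String outputDir) throws IOException {
        return new File(reportPath(outputDir)).getCanonicalPath();
    }

    // Writes the report, creating the local parent directory first so that
    // PrintStream does not fail because the directory is missing.
    static File writeReport(String outputDir, String text) throws IOException {
        File report = new File(reportPath(outputDir));
        File parent = report.getParentFile();
        if (parent != null && !parent.isDirectory() && !parent.mkdirs()) {
            throw new IOException("could not create " + parent);
        }
        try (PrintStream writer = new PrintStream(report)) {
            writer.println(text);
        }
        return report;
    }

    public static void main(String[] args) throws IOException {
        // Same convention as the snippet: args[1] is the output directory.
        String outputDir = args.length > 1 ? args[1] : "output";
        System.out.println("report resolves to: " + resolvedLocation(outputDir));
        File report = writeReport(outputDir, "placeholder statistics");
        System.out.println("wrote " + report.length() + " bytes");
    }
}
```

Running this on the machine that submits the job prints the absolute local path, which is where I would look for report.txt first. It does not touch the cluster's distributed filesystem at all.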