Running Nutch 1.9 in Eclipse fails with "CrawlDb update: java.io.IOException: Job failed!"

Time: 2014-11-10 08:18:10

Tags: eclipse hadoop nutch

I am trying to run Nutch 1.9 in Eclipse, and all of my configuration follows this article (http://yewintko.wordpress.com/2014/02/02/setting-up-nutch-in-eclipse-indigo/). However, I get this error:

CrawlDb update: starting at 2014-11-10 15:50:10
CrawlDb update: db: urls
CrawlDb update: segments: [3, crawl]
CrawlDb update: additions allowed: true
CrawlDb update: URL normalizing: false
CrawlDb update: URL filtering: false
CrawlDb update: 404 purging: false
CrawlDb update: Merging segment data into db.
CrawlDb update: java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
    at org.apache.nutch.crawl.CrawlDb.update(CrawlDb.java:119)
    at org.apache.nutch.crawl.CrawlDb.run(CrawlDb.java:219)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.CrawlDb.main(CrawlDb.java:179)

1 answer:

Answer 0 (score: 1)

Have you tried following the steps in the Nutch wiki?