Solr dedup error: Failed with exit value 255

Time: 2015-01-28 05:53:41

Tags: java apache solr web-crawler nutch

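I am using Apache Nutch 2.3 to crawl some data from the web. My Solr version is 4.10.3. The data is crawled into HBase successfully and is also indexed in Solr, but at the end (during the deduplication phase) the following error appears in the console: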

IndexingJob: done.
SOLR dedup -> http://solr:8983/solr
/home/crawler/nutch-2.3/bin/nutch solrdedup -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true http://solr:8983/solr
Error running:
  /home/crawler/nutch-2.3/bin/nutch solrdedup -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true http://solr:8983/solr
Failed with exit value 255.

Here, solr stands for the IP of the machine running Apache Solr. The corresponding error in the Apache Nutch log file is shown below:

2015-01-28 10:39:47,830 WARN  mapred.FileOutputCommitter - Output path is null in cleanup
2015-01-28 10:39:47,830 WARN  mapred.LocalJobRunner - job_local345700287_0001
java.lang.Exception: java.lang.NullPointerException
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.io.Text.encode(Text.java:388)
        at org.apache.hadoop.io.Text.set(Text.java:178)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrRecordReader.nextKeyValue(SolrDeleteDuplicates.java:233)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:531)
        at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
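
The trace shows a null string reaching Text.set inside SolrDeleteDuplicates$SolrRecordReader.nextKeyValue. One possible reading, which is an assumption and not confirmed anywhere in this post, is that some indexed documents have no value for a field the dedup job reads. A minimal sketch for checking that: it only builds a Solr query URL that counts documents with no value for a given field. The field name digest, the /select handler path, and the class and method names are all assumptions made for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class MissingFieldQuery {

    // Builds a Solr select URL that counts documents with no value for `field`.
    // solrBase is the same base URL that was passed to "nutch solrdedup".
    static String buildUrl(String solrBase, String field) {
        // "*:* -field:[* TO *]" = all documents minus those that have any value in `field`.
        String q = "*:* -" + field + ":[* TO *]";
        try {
            // rows=0 asks only for the hit count (numFound), not the documents themselves.
            return solrBase + "/select?q=" + URLEncoder.encode(q, "UTF-8") + "&rows=0&wt=json";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "digest" is an assumed field name; substitute the field your schema uses.
        System.out.println(buildUrl("http://solr:8983/solr", "digest"));
    }
}
```

Opening the printed URL in a browser or with curl returns a numFound count. If it is greater than zero, documents without that field exist in the index, which would be consistent with the NullPointerException above, though it does not prove that this is the cause.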

Is there a problem with Nutch or with Solr? How can I fix this?

0 answers:

No answers yet