Indexing and crawling with Nutch and Solr

Time: 2018-10-11 11:16:42

Tags: solr web-crawler nutch full-text-indexing

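As a beginner, I am trying to crawl and index a single website using Nutch with Solr, but I get the error below. I don't know what exactly is going wrong. Can anyone help me with this? Thanks in advance.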

Segment dir is complete: crawl/segments/20181011155036.
Segment dir is complete: crawl/segments/20181011155140.
Indexer: starting at 2018-10-11 15:54:06
Indexer: deleting gone documents: false
Indexer: URL filtering: false
Indexer: URL normalizing: false
No exchange was configured. The documents will be routed to all index writers.
Active IndexWriters :
SOLRIndexWriter
    type : Type of the server. Can be: "cloud", "concurrent", "http" or "lb"
    url : URL of the SOLR instance or URL of the Zookeeper quorum
    commitSize : buffer size when sending to SOLR (default 1000)
    auth : use authentication (default false)
    username : username for authentication
    password : password for authentication


Indexing 20/20 documents
Deleting 0 documents
Indexing job did not succeed, job status:FAILED, reason: NA
Indexer: java.lang.RuntimeException: Indexing job did not succeed, job status:FAILED, reason: NA
    at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:152)
    at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:235)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:244)
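
The console output above only reports `reason: NA`, so the underlying cause is not visible here. A minimal sketch, in Python, of how one might pull exception lines out of a fuller log to narrow it down. The path `logs/hadoop.log` is an assumption about where a local Nutch run writes its detailed log, and `find_exceptions` is a hypothetical helper name, not part of Nutch or Solr.

```python
import re
from pathlib import Path

# Matches a fully qualified Java class name ending in "Exception" or "Error",
# e.g. "java.lang.RuntimeException". Plain stack-frame lines do not match.
EXCEPTION_RE = re.compile(
    r"\b(?:[A-Za-z_$][\w$]*\.)+[A-Z][\w$]*(?:Exception|Error)\b"
)


def find_exceptions(log_text, context=3):
    """Return (line_number, snippet) for each log line naming a Java exception.

    The snippet holds the matching line plus up to `context` following lines.
    """
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if EXCEPTION_RE.search(line):
            snippet = "\n".join(lines[i:i + 1 + context])
            hits.append((i + 1, snippet))
    return hits


if __name__ == "__main__":
    # Assumed log location; adjust to wherever your installation writes logs.
    log_path = Path("logs/hadoop.log")
    if log_path.exists():
        text = log_path.read_text(errors="replace")
        for line_no, snippet in find_exceptions(text):
            print(f"--- line {line_no} ---")
            print(snippet)
    else:
        print(f"{log_path} not found; point log_path at the correct log file")
```

Any `Caused by:` lines this surfaces are usually more informative than the top-level `RuntimeException` shown in the console.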

0 answers:

No answers