Apache Solr:我在Cygwin Terminal中运行了这个命令。 ./nutch crawl urls -dir newCrawl -solr http:// localhost:8939 / solr / -depth 10 -topN 10

时间:2015-10-15 05:32:47

标签: java indexing solr cygwin nutch

SolrIndexer:从2015-10-15 10:13:00开始 添加90个文档:

java.io.IOException: Job failed!
SolrDeleteDuplicates: starting at 2015-10-15 10:13:11
SolrDeleteDuplicates: Solr url: http://localhost:8939/solr/
Exception in thread "main" java.io.IOException: org.apache.solr.client.solrj.SolrServerException: java.net.ConnectException: Connection refused: connect
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getSplits(SolrDeleteDuplicates.java:200)
        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:373)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:353)
        at org.apache.nutch.crawl.Crawl.run(Crawl.java:153)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)
Caused by: org.apache.solr.client.solrj.SolrServerException: java.net.ConnectException: Connection refused: connect
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:478)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
        at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
        at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getSplits(SolrDeleteDuplicates.java:198)
        ... 9 more
Caused by: java.net.ConnectException: Connection refused: connect
        at java.net.DualStackPlainSocketImpl.connect0(Native Method)
        at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:79)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at java.net.Socket.connect(Socket.java:538)
        at java.net.Socket.<init>(Socket.java:434)
        at java.net.Socket.<init>(Socket.java:286)
        at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
        at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
        at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
        at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
        at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:422)
        ... 13 more

如何解决此问题?

2 个答案:

答案 0 :(得分:1)

您的solr网址无效。你也需要编写核心。默认情况下,核心是collection1。因此,在您的情况下,网址为http://localhost:8983/solr/collection1

答案 1 :(得分:0)

Solr似乎没有运行 - 连接被拒绝表示某些东西无法联系它应该联系的任何东西。您的配置可能已损坏,或者您尚未启动Solr,或者您正在另一个端口上运行。