I crawl websites by URL and extract data from them. I am using Solr 3.4.0 and Nutch 1.9.
It was working fine, but since last week I have suddenly been getting this error:
2015-06-18 18:32:49,718 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: content dest: content
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: site dest: site
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: title dest: title
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: host dest: host
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: segment dest: segment
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: boost dest: boost
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: digest dest: digest
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: tstamp dest: tstamp
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: url dest: id
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: url dest: url
2015-06-18 18:32:54,484 INFO solr.SolrWriter - Adding 1000 documents
2015-06-18 18:34:34,156 WARN mapred.LocalJobRunner - job_local_0030
org.apache.solr.common.SolrException: Internal Server Error
Internal Server Error
request: http://host IP:port/solr/news/update?wt=javabin&version=2
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
at org.apache.nutch.indexer.solr.SolrWriter.write(SolrWriter.java:81)
at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:54)
at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:44)
at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:440)
at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:166)
at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:51)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2015-06-18 18:34:35,062 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
2015-06-18 18:34:35,140 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2015-06-18 18:34:35
2015-06-18 18:34:35,140 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://Host IP:Port/solr/news
2015-06-18 18:36:52,718 WARN mapred.LocalJobRunner - job_local_0031
java.lang.NullPointerException
at org.apache.hadoop.io.Text.encode(Text.java:388)
at org.apache.hadoop.io.Text.set(Text.java:178)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:270)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:241)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Can anyone help me resolve this error? Thanks.
The Solr log gives me this error: org.apache.solr.update.SolrIndexWriter finalize SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
org.apache.solr.common.SolrException log SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock