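Nutch message "No IndexWriters activated" while loading to Solr

Time: 2013-07-15 08:13:04

Tags: solr nutch

I followed the Nutch tutorial at http://wiki.apache.org/nutch/NutchTutorial and ran the Nutch crawler, but when I start loading the crawl into Solr I get the message "No IndexWriters activated - check your configuration":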

bin/nutch solrindex http://localhost:8983/solr crawl/crawldb/ -dir crawl/segments/
Indexer: starting at 2013-07-15 08:09:13
Indexer: deleting gone documents: false
Indexer: URL filtering: false
Indexer: URL normalizing: false
**No IndexWriters activated - check your configuration**

Indexer: finished at 2013-07-15 08:09:21, elapsed: 00:00:07

4 answers:

Answer 0 (score: 6)

Make sure the indexer-solr plugin is included. Open conf/nutch-site.xml and add the plugin to the plugin.includes property, for example:

  

protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)

After adding the plugin, the "No IndexWriters activated - check your configuration" warning disappeared in my case.
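For reference, here is a minimal sketch of how that value might sit in conf/nutch-site.xml. It assumes the file uses the usual configuration/property layout and that no other properties are needed; the plugin list is the example from this answer and may need adjusting to the plugins your crawl actually uses.

```xml
<!-- conf/nutch-site.xml (sketch; only the plugin.includes property is shown) -->
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- indexer-solr must appear in this list, otherwise no IndexWriter is activated -->
    <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
  </property>
</configuration>
```

Note that the entries are separated by `|` with no spaces, so a missing separator between two entries can silently drop both plugins.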

See also this thread: http://lucene.472066.n3.nabble.com/a-plugin-extending-IndexWriter-td4074353.html

Answer 1 (score: 2)

The suggestions from @Tryskele and @Scott101 worked for me:

Add the plugin.includes property to both conf/nutch-site.xml and runtime/local/conf/nutch-site.xml:

<property>
  <name>plugin.includes</name>
  <value>protocol-httpclient|urlfilter-regex|index-(basic|more)|query-(basic|site|url|lang)|indexer-solr|nutch-extensionpoints|protocol-httpclient|urlfilter-regex|parse-(text|html|msexcel|msword|mspowerpoint|pdf)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)|protocol-http|urlfilter-regex|parse-(html|tika|metatags)|index-(basic|anchor|more|metadata)</value>
</property>

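Answer 2 (score: 0)

I don't know whether this is still an issue, but I ran into the same problem and then realized that my src/plugin/build.xml was missing the indexer-solr plugin. Adding the following line and then recompiling Nutch fixed it for me: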

<ant dir="indexer-solr" target="deploy"/>
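As a rough sketch of where that line might go: this assumes src/plugin/build.xml has a deploy target that lists one ant element per plugin, as the target="deploy" attribute suggests. The surrounding structure is an assumption and may differ in your checkout.

```xml
<!-- src/plugin/build.xml (excerpt; other plugin entries omitted) -->
<target name="deploy">
  <!-- ...existing <ant dir="..." target="deploy"/> entries for the other plugins... -->
  <ant dir="indexer-solr" target="deploy"/>
</target>
```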

Answer 3 (score: 0)

Add the following property to conf/nutch-site.xml to enable the plugin:

<property>
<name>plugin.includes</name>
<value>protocol-httpclient|urlfilter-regex|index-(basic|more)|query-(basic|site|url|lang)|indexer-solr|nutch-extensionpoints|protocol-httpclient|urlfilter-regex|parse-(text|html|msexcel|msword|mspowerpoint|pdf)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)|protocol-http|urlfilter-regex|parse-(html|tika|metatags)|index-(basic|anchor|more|metadata)</value>
</property>

Let me know if this solves your problem.