Question

Can you tell me the steps to configure Tika 0.9 with Solr 3.1?

<requestHandler name="/update/extract"
                startup="lazy"
                class="solr.extraction.ExtractingRequestHandler" >
  <lst name="defaults">
    <!-- All the main content goes into "text"... if you need to return
         the extracted text or do highlighting, use a stored field. -->
    <str name="fmap.content">text</str>
    <str name="lowernames">true</str>
    <str name="uprefix">ignored_</str>

    <!-- capture link hrefs but ignore div attributes -->
    <str name="captureAttr">true</str>
    <str name="fmap.a">links</str>
    <str name="fmap.div">ignored_</str>
  </lst>
</requestHandler>

I am using this configuration in solrconfig.xml. Please help me.

Thanks,

Answer 1

Assuming you have Tika (and its dependencies) installed in Solr, that should be all you need to do.
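If the Tika jars are not on Solr's classpath yet, one way to load them is with <lib> directives in solrconfig.xml. This is only a sketch modeled on the example configuration that ships with Solr 3.1. The relative paths are an assumption: they presume the core lives under example/solr inside the unpacked distribution, so adjust them to wherever contrib/extraction/lib and dist sit in your install.

```xml
<!-- Placed directly under <config> in solrconfig.xml.
     Paths are resolved relative to the core's instanceDir and are
     assumptions based on the stock example layout. -->

<!-- Tika 0.9 and its parser dependencies -->
<lib dir="../../contrib/extraction/lib" />

<!-- The Solr Cell jar that provides ExtractingRequestHandler -->
<lib dir="../../dist/" regex="apache-solr-cell-\d.*\.jar" />
```

Because the handler is declared with startup="lazy", a missing jar usually only shows up as a class-loading error on the first request to /update/extract, not at startup.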

Have you read the ExtractingRequestHandler wiki page? It has a lot of information, plus a few curl recipes that let you test whether everything is working.
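One more thing to check while testing: the handler defaults in the question route content to fields named text and links, and everything unknown to fields prefixed with ignored_, so schema.xml has to define them. A rough sketch of matching definitions, modeled on the Solr example schema; the field types and stored/indexed settings here are assumptions and should be adapted to your own schema.

```xml
<!-- Inside <types> in schema.xml: a type that silently drops its values -->
<fieldtype name="ignored" stored="false" indexed="false"
           multiValued="true" class="solr.StrField" />

<!-- Inside <fields> in schema.xml -->
<!-- Target of fmap.content; set stored="true" if you need to return
     or highlight the extracted text -->
<field name="text" type="text" indexed="true" stored="false" multiValued="true" />

<!-- Target of fmap.a (captured link hrefs) -->
<field name="links" type="string" indexed="true" stored="true" multiValued="true" />

<!-- Catches everything mapped through uprefix="ignored_" and fmap.div -->
<dynamicField name="ignored_*" type="ignored" multiValued="true" />
```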
