Solr de-duplication (dedupe) produces all zeros in signatureField

Time: 2016-07-25 20:13:25

Tags: solr duplicates

I have followed the examples in the documentation here: http://wiki.apache.org/solr/Deduplication and here: https://cwiki.apache.org/confluence/display/solr/De-Duplication

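However, when I inspect the results, every signatureField comes back as: 0000000000000000

I can't figure out why a unique signature is not being generated.

Relevant parts of the configuration:

In solrconfig.xml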


...

 <requestHandler name="/update"
              class="solr.XmlUpdateRequestHandler">
<!-- See below for information on defining
     updateRequestProcessorChains that can be used by name
     on each Update Request
  -->

   <lst name="defaults">
     <str name="update.chain">dedupe</str>
   </lst>

</requestHandler>

In schema.xml

<!-- Deduplication

   An example dedup update processor that creates the "id" field
   on the fly based on the hash code of some other fields.  This
   example has overwriteDupes set to false since we are using the
   id field as the signatureField and Solr will maintain
   uniqueness based on that anyway.

-->

 <updateRequestProcessorChain name="dedupe">
   <processor class="solr.processor.SignatureUpdateProcessorFactory">
     <bool name="enabled">true</bool>
     <str name="signatureField">signatureField</str>
     <bool name="overwriteDupes">false</bool>
     <str name="fields">name,features,cat</str>
     <str name="signatureClass">solr.processor.Lookup3Signature</str>
   </processor>
   <processor class="solr.LogUpdateProcessorFactory" />
   <processor class="solr.RunUpdateProcessorFactory" />
 </updateRequestProcessorChain>
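
For comparison, the De-Duplication documentation linked above also declares the signature field in schema.xml. A minimal sketch, assuming the field keeps the name signatureField used in the chain and that a string field type is already defined in the schema:

```xml
<!-- Assumed declaration: holds the hash written by the dedupe chain.
     The name must match the signatureField value in the processor config. -->
<field name="signatureField" type="string" stored="true" indexed="true" multiValued="false" />
```

One thing that may be worth checking (a guess, not a confirmed diagnosis) is whether the indexed documents actually contain name, features, or cat. If none of the configured fields is present in a document, there may be nothing for Lookup3Signature to hash.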

Could anyone point me in the right direction?

0 answers:

No answers yet