Question

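I am searching for the query "nurses" against the copy fields a_cpy, b_cpy, c_cpy, which are copies of fields a, b, c respectively.

The values indexed in a, b, c are stemmed, whereas the values indexed in a_cpy, b_cpy, c_cpy are not.

My hl.fl value is a,b,c, while qf is a_cpy,b_cpy,c_cpy and hl.q is "nurses".

When the search term is "nurses", the Solr response does not highlight "nurse", although highlighting is otherwise correct.

Is this expected behavior, or is there something wrong with my approach?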

Answer 1

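You mentioned that one field has a stemming filter and the other field does not.

To answer your question: this is correct behavior, and nothing is wrong. The example below uses the Solr Analysis page to show why it happens.

For the field named text, the following field type, which has no stemming filter factory, is used.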

<field name="text" type="text_general"/>

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.SynonymGraphFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>

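When you analyze data for the text field above on the Solr Analysis page, you will find that the query does not match the indexed data.

It does not match because the indexed data (the tokens produced at the end of the filter chain) differs from the query value.

For the field named text_copy_stemmed, the following field type, which has a stemming filter factory, is used. At index time it applies <filter class="solr.KStemFilterFactory"/>.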

<field name="text_copy_stemmed" type="text_general_stemmed" indexed="true" stored="true"/>

<copyField source="text" dest="text_copy_stemmed"/>

<fieldType name="text_general_stemmed" class="solr.TextField" positionIncrementGap="100" multiValued="true">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.KStemFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.SynonymGraphFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>

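When you analyze data for the text_copy_stemmed field above on the Solr Analysis page, you will find that the query does match the indexed data.

A query matches when its tokens are found in the index. Compare the tokens produced at the end of the index-time filter chain with the tokens produced from the query.

I indexed the JSON below and ran highlighting against the same data.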
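The token comparison can be imitated outside Solr with a rough sketch. This is a simplified model, not Solr's highlighter: `toy_stem` is a hypothetical stand-in that strips a few suffixes and is not the KStem algorithm, and stop words and synonyms are ignored. As in the field types shown here, stemming is applied only on the index side, never to the query.

```python
def toy_stem(token):
    # Crude illustrative stemmer (an assumption); NOT the KStem algorithm.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text, stem):
    # Whitespace tokenization + lowercasing, optionally followed by stemming.
    return [toy_stem(t) if stem else t for t in text.lower().split()]


def highlighted_words(stored_text, query, stem_at_index):
    # The query side is never stemmed, mirroring the query analyzers above.
    query_tokens = set(analyze(query, stem=False))
    words = stored_text.split()
    index_tokens = analyze(stored_text, stem=stem_at_index)
    # A word counts as a hit when its index-time token equals a query token.
    return [w for w, tok in zip(words, index_tokens) if tok in query_tokens]


doc = "jump jumping jumped organizational organizations"
print(highlighted_words(doc, "jump", stem_at_index=False))
print(highlighted_words(doc, "jump", stem_at_index=True))
print(highlighted_words(doc, "jumping", stem_at_index=True))
```

In this model the unstemmed field only yields the literal word, the stemmed field yields every word that reduces to the same token, and an unstemmed query term such as "jumping" finds nothing in the stemmed field because only "jump" was indexed there.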

{
"id":"gb18030-example.xml",
"text":"jump jumping jumped organizational organizations",
"text_copy_stemmed":"jump jumping jumped organizational organizations"
}
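
To try this against the document above, a request along the following lines could be used. This is only a sketch: the host and the core name `mycore` are hypothetical, and the choice to query the text field while highlighting both fields is an assumption made so the two results can be compared side by side. The parameters `defType`, `qf`, `hl`, `hl.fl` and `hl.q` are standard Solr request parameters. The code only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode

# Hypothetical host and core name; adjust to your installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"

params = {
    "defType": "edismax",                # qf requires the (e)dismax parser
    "q": "jump",
    "qf": "text",                        # field searched
    "hl": "true",
    "hl.fl": "text,text_copy_stemmed",   # fields highlighted
    "hl.q": "jump",                      # query used for highlighting
}

request_url = SOLR_SELECT_URL + "?" + urlencode(params)
print(request_url)
```

Comparing the highlighting section of the response for each field shows which stored words each analysis chain treats as matching the query term.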

Solr highlighting breaks when qf and hl.fl are different