Question

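My users want to search using special characters together with wildcards. In this case, the character is the dash (-).

If I search with 'xxx' or 'xxx\-', I get results containing both "xxx-" and "xxx". But I don't want "xxx"; I only want the results that contain "xxx-" (with the dash).

I tried searching with xxx-*, which gave me no results at all.

The schema looks like this: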

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="false">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Any idea how to achieve this?

Answer 1

I got it working with the following:

<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
<field indexed="true" multiValued="false" name="field_name" stored="true" type="string"/>

Searching with field_name=xxx-* now returns only the fields that start with xxx-.
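
A string field is matched against the whole, untokenized value and is case-sensitive, so it suits prefix searches like this one but not general full-text search. If both are needed, one option is to keep the analyzed field and copy its content into a string field. A minimal sketch, assuming hypothetical field names title and title_exact (neither appears in the original schema) and the string type declared above:

<!-- Hypothetical field names, for illustration only -->
<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="title_exact" type="string" indexed="true" stored="false"/>
<!-- Copies the raw input value, before any analysis, into the string field -->
<copyField source="title" dest="title_exact"/>

Under these assumptions, a query such as title_exact:xxx-* would match only values beginning with xxx-, while title keeps serving ordinary text searches. Existing documents have to be reindexed after a schema change like this.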

Answer 2

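First, understand the problem. The documentation for the Standard Tokenizer says:

Standard Tokenizer
"This tokenizer splits the text field into tokens, treating whitespace and punctuation as delimiters. Delimiter characters are discarded."

In your case, - is a punctuation character, so it is dropped at index time and only xxx ends up in the index. That explains why searching for xxx-* returns nothing: no indexed term starts with xxx-.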

What you can do is replace StandardTokenizerFactory with WhitespaceTokenizerFactory, which splits the text stream on whitespace only and therefore keeps punctuation inside the tokens.
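
As a rough sketch of that change, the field type could look like the following. The name text_ws is hypothetical, and the stop word and synonym filters from the question are left out only to keep the example short:

<!-- Hypothetical field type; the same analyzer is used for indexing and querying -->
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Splits on whitespace only, so a token such as xxx- keeps its dash -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

The trade-off is that all other punctuation stays attached as well, so a word followed by a comma is indexed with the comma. The index has to be rebuilt after the analyzer changes, since existing terms were produced by the old tokenizer.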

Solr wildcard search with special characters
