Question

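I use Solr to search my data, and I have now realized that some features of the Solr query language don't work for me. These are the capabilities I am missing: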

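Fuzzy search
Wildcards (*, ?) - so far I have not set up stemming, which would be useful for search for the time being
Field specification - currently I cannot restrict a search to a field, as in title:Blabla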

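As far as I know, these features should be available in Solr by default, but I evidently don't have them. I am using Solr 1.4. You can find my schema here. Thanks for your help.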

Answer 1

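I Googled "solr fuzzy search" and found your question here. In fact, Solr 4.0 supports fuzzy search through the simple query syntax.

For example, you can search name:peter for a strict match, or append a tilde, as in name:peter~, for a fuzzy search. If you want to limit the fuzziness somewhat, you can add a similarity value in the form name:peter~0.7, which means you are searching for terms that match "peter" with 70% "sharpness".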
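As a rough illustration of how these queries might be sent to Solr, here is a minimal Python sketch that only builds the request URLs. The host, port, and path in `SOLR_SELECT_URL` and the helper name `build_select_url` are assumptions made for this example, not something taken from the question; the `name` field is the one used above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed endpoint for illustration; adjust host, port, and path to your install.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_select_url(query, rows=10):
    """Return a select URL with the query string safely URL-encoded.

    Hypothetical helper: it only assembles the URL and sends nothing.
    """
    params = {"q": query, "rows": rows}
    return SOLR_SELECT_URL + "?" + urlencode(params)


# Strict match on the "name" field.
strict_url = build_select_url("name:peter")

# Fuzzy match: the trailing tilde asks for terms similar to "peter".
fuzzy_url = build_select_url("name:peter~")

# Fuzzy match with the similarity value from the answer above.
tuned_url = build_select_url("name:peter~0.7")

# Decode the q parameter again to confirm the encoding round-trips.
for url in (strict_url, fuzzy_url, tuned_url):
    print(parse_qs(urlsplit(url).query)["q"][0])
```

Encoding the query with `urlencode` rather than concatenating strings keeps characters such as `:` and `~` intact when the URL is parsed on the server side.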

Answer 2

Your fieldType name="text" is missing a lot of filters. For reference, this is the text fieldType from the default schema.xml:

<!-- A text field that uses WordDelimiterFilter to enable splitting and matching of
    words on case-change, alpha numeric boundaries, and non-alphanumeric chars,
    so that a query of "wifi" or "wi fi" could match a document containing "Wi-Fi".
    Synonyms and stopwords are customized by external files, and stemming is enabled.
    -->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
      add enablePositionIncrements=true in both the index and query
      analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

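For example, SnowballPorterFilterFactory is what enables stemming.

I recommend building your schema from the default schema.xml, tweaking and modifying it as needed (rather than starting from scratch).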

Here's the reference for analyzers, tokenizers and filters

How do I use wildcards and fuzzy search with Solr?
