Solr Indexing & Search

Time: 2016-11-08 09:06:21

Tags: solr

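Can anyone suggest the best way to use Solr for general search over my products? It should also support synonyms and fuzzy search.

If I search for the word cro, I want products that start with cro first, then products where cro appears anywhere in the field, then synonym matches, and then fuzzy matches, each with a corresponding boost.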

1 answer:

Answer 0 (score: 0)

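I had this scenario in my project.

I use FuzzyLookupFactory (with multiple suggesters) together with AnalyzingInfixLookupFactory. I send the queries with SolrJ (the Java API).

First I search for the input with AnalyzingInfixLookupFactory. It finds the input anywhere in the field, but you have to spell it correctly. For example, with a product named "toshiba", searching for "tosh" correctly finds "toshiba", but searching for "toshhiba" finds no products.

That is where I use the FuzzyLookupFactory suggesters. I split every product name into words (for example, for "toshiba laptop", word1 = toshiba, word2 = laptop, and so on) and search the words one by one. The fuzzy lookup resolves toshba -> toshiba, and once toshiba is found I use AnalyzingInfixLookupFactory again to look up the complete product field.

For example, you want to find "toshiba laptop" but search for "toshba laptp". First try AnalyzingInfixLookupFactory; the response will be null. After the null response, run a fuzzy lookup for each word and join what you find: toshba -> toshiba plus laptp -> laptop gives "toshiba laptop". Now you can search the complete field again with AnalyzingInfixLookupFactory.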
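The fallback order described above can be sketched in plain Java. This is only an illustration of the control flow, not the actual SolrJ code: the two `Function` parameters are hypothetical stand-ins for the calls to the `/suggestAnalyzing` and `/suggest` handlers, and the demo data (`DEMO_INFIX`, `DEMO_FUZZY`) is invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

class SuggestFallback {

    // Hypothetical stand-in for the infix suggester: returns every demo
    // product that contains the query text anywhere in its name.
    static final Function<String, List<String>> DEMO_INFIX = query -> {
        List<String> hits = new ArrayList<>();
        for (String product : List.of("toshiba laptop", "toshiba tablet")) {
            if (product.contains(query.toLowerCase())) {
                hits.add(product);
            }
        }
        return hits;
    };

    // Hypothetical stand-in for the per-word fuzzy suggesters: maps a
    // misspelled word to its correction, or returns empty if it knows none.
    static final Function<String, Optional<String>> DEMO_FUZZY = word ->
            Optional.ofNullable(
                    Map.of("toshba", "toshiba", "laptp", "laptop").get(word.toLowerCase()));

    // 1. Try the infix lookup on the query as typed.
    // 2. If nothing comes back, correct each word with the fuzzy lookup.
    // 3. Run the infix lookup again on the corrected query.
    static List<String> search(String query,
                               Function<String, List<String>> infix,
                               Function<String, Optional<String>> fuzzy) {
        List<String> hits = infix.apply(query);
        if (!hits.isEmpty()) {
            return hits;
        }
        List<String> corrected = new ArrayList<>();
        for (String word : query.trim().split("\\s+")) {
            // Keep the original word when the fuzzy lookup has no suggestion.
            corrected.add(fuzzy.apply(word).orElse(word));
        }
        return infix.apply(String.join(" ", corrected));
    }

    public static void main(String[] args) {
        System.out.println(search("tosh", DEMO_INFIX, DEMO_FUZZY));
        System.out.println(search("toshba laptp", DEMO_INFIX, DEMO_FUZZY));
    }
}
```

In a real setup the two functions would wrap the SolrJ requests and read the suggestions out of the response; the order of the steps stays the same.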

My analyzing suggester (in solrconfig.xml):

<searchComponent name="suggestAnalyzing" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">suggestAnalyzing</str>
<str name="lookupImpl">AnalyzingInfixLookupFactory</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="storeDir">suggester_fuzzy_dir</str>
<str name="indexPath">suggester_infix_dir</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>  
<str name="field">COMPLETE_FIELD</str>
<str name="suggestAnalyzerFieldType">textgen</str>

 <float name="threshold">0.005</float>
 <str name="buildOnStartup">false</str>
 <str name="buildOnCommit">false</str>
 </lst>


 </searchComponent>

 <requestHandler name="/suggestAnalyzing" class="solr.SearchHandler">
 <lst name="defaults">
 <str name="suggest.dictionary">suggestAnalyzing</str>
 <str name="suggest">true</str>
 <str name="suggest.count">10</str>

 </lst>
 <arr name="components">
 <str>suggestAnalyzing</str>
 </arr>
 </requestHandler>
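Because both `buildOnStartup` and `buildOnCommit` are `false` in this configuration, the suggester has to be built explicitly before it returns anything, for example by sending `suggest.build=true` once after indexing. A minimal sketch of how such a request URL could be assembled follows; the host, port, and core name (`products`) are assumptions for illustration, not values from the original setup.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class SuggestUrl {

    // Builds the URL for a suggester request. baseUrl is assumed to point at
    // the core and to have no trailing slash; handler is the request handler
    // path from solrconfig.xml, e.g. "/suggestAnalyzing".
    static String suggestUrl(String baseUrl, String handler, String query, boolean build) {
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        StringBuilder url = new StringBuilder(baseUrl)
                .append(handler)
                .append("?suggest.q=")
                .append(encoded);
        if (build) {
            // Rebuilds the suggester's dictionary; only needed after the
            // index has changed, not on every request.
            url.append("&suggest.build=true");
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Hypothetical local core named "products".
        String base = "http://localhost:8983/solr/products";
        System.out.println(suggestUrl(base, "/suggestAnalyzing", "toshiba laptop", true));
        System.out.println(suggestUrl(base, "/suggestAnalyzing", "tosh", false));
    }
}
```

The `suggest=true`, `suggest.dictionary`, and `suggest.count` parameters are omitted here because the handler's defaults already supply them.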

My fuzzy suggester (solrconfig.xml):

<searchComponent class="solr.SuggestComponent" name="suggest">

<lst name="suggester">
<str name="name">word1suggester</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">word1</str>  <!-- the indexed field to derive suggestions from -->   
<str name="suggestAnalyzerFieldType">textgen</str>
<str name="storeDir">suggest_fuzzy_doc_expr_dict</str>
<str name="buildOnStartup">false</str>
<str name="buildOnCommit">false</str>
</lst>

<lst name="suggester">
<str name="name">word2suggester</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">word2</str>  <!-- the indexed field to derive suggestions from -->
<str name="suggestAnalyzerFieldType">textgen</str>
<str name="indexPath">suggestions/word2suggester</str>
<str name="storeDir">suggest_fuzzy_doc_expr_dict</str>
<str name="buildOnStartup">false</str>
<str name="buildOnCommit">false</str>
</lst>

<lst name="suggester">
<str name="name">word3suggester</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">word3</str>  <!-- the indexed field to derive suggestions from -->
<str name="suggestAnalyzerFieldType">textgen</str>
<str name="indexPath">suggestions/word3suggester</str>
<str name="storeDir">suggest_fuzzy_doc_expr_dict</str>
<str name="preserveSep">true</str>
<str name="preservePositionIncrements">true</str>
<str name="exactMatchFirst">true</str> 
<float name="threshold">0.005</float> 
<str name="buildOnStartup">false</str>
<str name="buildOnCommit">false</str>

</lst>

<lst name="suggester">
<str name="name">word4suggester</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">word4</str>  <!-- the indexed field to derive suggestions from -->
<str name="suggestAnalyzerFieldType">textgen</str>
<str name="indexPath">suggestions/word4suggester</str>
<str name="storeDir">suggest_fuzzy_doc_expr_dict</str>
<str name="preserveSep">true</str>
<str name="preservePositionIncrements">true</str>
<str name="buildOnStartup">false</str>
<str name="buildOnCommit">false</str>
</lst>

<lst name="suggester">
<str name="name">word5suggester</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">word5</str>  <!-- the indexed field to derive suggestions from -->
<str name="suggestAnalyzerFieldType">textgen</str>
<str name="indexPath">suggestions/word5suggester</str>
<str name="storeDir">suggest_fuzzy_doc_expr_dict</str>
<str name="preserveSep">true</str>
<str name="preservePositionIncrements">true</str>
<str name="buildOnStartup">false</str>
<str name="buildOnCommit">false</str>
</lst>

<lst name="suggester">
<str name="name">word6suggester</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">word6</str>  <!-- the indexed field to derive suggestions from -->
<str name="suggestAnalyzerFieldType">textgen</str>
<str name="indexPath">suggestions/word6suggester</str>
<str name="storeDir">suggest_fuzzy_doc_expr_dict</str>
<str name="preserveSep">true</str>
<str name="preservePositionIncrements">true</str>
<str name="exactMatchFirst">true</str> 
<float name="threshold">0.005</float> 
<str name="buildOnStartup">false</str>
<str name="buildOnCommit">false</str>
</lst>


<lst name="suggester">
<str name="name">word7suggester</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">word7</str>  <!-- the indexed field to derive suggestions from -->
<str name="suggestAnalyzerFieldType">textgen</str>
<str name="indexPath">suggestions/word7suggester</str>
<str name="storeDir">suggest_fuzzy_doc_expr_dict</str>
<str name="preserveSep">true</str>
<str name="preservePositionIncrements">true</str>
<str name="exactMatchFirst">true</str> 
<float name="threshold">0.005</float> 
<str name="buildOnStartup">false</str>
<str name="buildOnCommit">false</str>
</lst>

<lst name="suggester">
<str name="name">word8suggester</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">word8</str>  <!-- the indexed field to derive  suggestions from -->
<str name="suggestAnalyzerFieldType">textgen</str>
<str name="indexPath">suggestions/word8suggester</str>
<str name="storeDir">suggest_fuzzy_doc_expr_dict</str>
<str name="preserveSep">true</str>
<str name="preservePositionIncrements">true</str>
<str name="exactMatchFirst">true</str> 
<float name="threshold">0.005</float> 
<str name="buildOnStartup">false</str>
<str name="buildOnCommit">false</str>
</lst>


</searchComponent>

<requestHandler class="solr.SearchHandler" name="/suggest">
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.dictionary">word1suggester</str>
<str name="suggest.dictionary">word2suggester</str>
<str name="suggest.dictionary">word3suggester</str>
<str name="suggest.dictionary">word4suggester</str>
<str name="suggest.dictionary">word5suggester</str>
<str name="suggest.dictionary">word6suggester</str>
<str name="suggest.dictionary">word7suggester</str>
<str name="suggest.dictionary">word8suggester</str>
<!-- number of suggestions returned per dictionary -->
<str name="suggest.count">10</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler> 

My managed-schema fields:

<fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>

<field name="word1" type="textgen" omitNorms="true" omitTermFreqAndPositions="true" multiValued="true" indexed="true" stored="true"/>

<field name="COMPLETE_FIELD" type="textgen" multiValued="true" indexed="true" stored="true"/>

<field name="word2" type="text_general" omitNorms="true" omitTermFreqAndPositions="true" multiValued="true" indexed="true" stored="true"/>

<field name="word3" type="text_general" omitNorms="true" omitTermFreqAndPositions="true" multiValued="true" indexed="true" stored="true"/>

<field name="word4" type="textgen" omitNorms="true" omitTermFreqAndPositions="true" multiValued="true" indexed="true" stored="true"/>

<field name="word5" type="textgen" omitNorms="true" omitTermFreqAndPositions="true" multiValued="true" indexed="true" stored="true"/>

<field name="word6" type="textgen" omitNorms="true" omitTermFreqAndPositions="true" multiValued="true" indexed="true" stored="true"/>

<field name="word7" type="textgen" omitNorms="true" omitTermFreqAndPositions="true" multiValued="true" indexed="true" stored="true"/>
<field name="word8" type="textgen" omitNorms="true" omitTermFreqAndPositions="true" multiValued="true" indexed="true" stored="true"/>
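The word1 to word8 fields above have to be filled at index time by splitting each product name into words. A minimal sketch of that split is shown below, assuming whitespace-separated names and assuming that words beyond the eighth are simply dropped; the class and method names are hypothetical, and the resulting map would still need to be copied into the document that is sent to Solr.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class WordFields {

    // The schema above defines word1 .. word8.
    static final int MAX_WORDS = 8;

    // Splits a product name on whitespace and assigns the first word to
    // "word1", the second to "word2", and so on. Words past MAX_WORDS are
    // dropped (an assumption; the original answer does not say).
    static Map<String, String> split(String productName) {
        Map<String, String> fields = new LinkedHashMap<>();
        String trimmed = productName.trim();
        if (trimmed.isEmpty()) {
            return fields;
        }
        String[] words = trimmed.split("\\s+");
        for (int i = 0; i < words.length && i < MAX_WORDS; i++) {
            fields.put("word" + (i + 1), words[i]);
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(split("toshiba laptop"));
    }
}
```

The full product name would go into COMPLETE_FIELD unchanged, so the infix suggester can still match the complete text.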