Retrieving the ngrams of a specific word in Solr

Time: 2014-11-19 06:23:44

Tags: solr

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="5" minShingleSize="2" outputUnigrams="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="5" minShingleSize="2" outputUnigrams="true"/>
  </analyzer>
</fieldType>
<!-- solrQueryParser is a schema-level element, so it sits outside the analyzer -->
<solrQueryParser defaultOperator="OR" />

I am using ShingleFilterFactory to create ngrams. Now I want to retrieve all the ngrams for a specific word. Say I enter "night"; then I want every ngram that contains night.
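To make the goal concrete, here is a minimal Python sketch of which terms are meant. It only approximates the set of terms that a shingle filter with minShingleSize=2, maxShingleSize=5 and outputUnigrams=true would produce; it is not Solr's implementation. It assumes no stop words were removed, and the function names `shingles` and `containing` and the sample sentence are made up for illustration.

```python
def shingles(tokens, min_size=2, max_size=5, output_unigrams=True):
    """Approximate the word n-grams a shingle filter with these settings emits."""
    out = []
    if output_unigrams:
        out.extend(tokens)
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            if start + size > len(tokens):
                break  # not enough tokens left for a shingle of this size
            out.append(" ".join(tokens[start:start + size]))
    return out


def containing(terms, word):
    """Keep only the terms in which `word` appears as a whole token."""
    return [t for t in terms if word in t.split(" ")]


# Hypothetical input text, already tokenized and lowercased.
tokens = "a late night show".split()
print(containing(shingles(tokens), "night"))
```

Matching on whole tokens rather than substrings keeps a term such as "knight" from being returned for "night".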

At the moment, the following query gives me the unique ngrams from all of my documents:

http://localhost/solr/admin/luke?fl=text&numTerms=50000&wt=json
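One possible way to narrow that output down to a single word is to filter the Luke response on the client. The Python sketch below assumes the JSON response lists the field's terms under `fields.<field>.topTerms` as a flat list that alternates term and count; that layout should be checked against the actual response. The helper names `terms_containing` and `fetch_luke` and the sample data are hypothetical.

```python
import json
from urllib.request import urlopen

# URL taken from the question; host and path depend on the installation.
LUKE_URL = "http://localhost/solr/admin/luke?fl=text&numTerms=50000&wt=json"


def terms_containing(luke_response, field, word):
    """Return {term: count} for terms of `field` that contain `word` as a whole token."""
    top_terms = luke_response["fields"][field]["topTerms"]
    # Assumed layout: [term1, count1, term2, count2, ...]
    pairs = zip(top_terms[0::2], top_terms[1::2])
    return {term: count for term, count in pairs if word in term.split(" ")}


def fetch_luke(url=LUKE_URL):
    """Download and parse the Luke handler response (requires a running Solr)."""
    with urlopen(url) as resp:
        return json.load(resp)


# Made-up response fragment, used here instead of a live server.
sample = {"fields": {"text": {"topTerms": ["night", 3, "late night", 2, "late", 2, "knight", 1]}}}
print(terms_containing(sample, "text", "night"))

# Against a live server: terms_containing(fetch_luke(), "text", "night")
```

This filters after downloading up to numTerms terms, which may be slow on a large index. If the TermsComponent handler is enabled in solrconfig.xml, its `terms.regex` parameter might do the same filtering on the server side, though that was not tried here.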

0 answers:

No answers yet