<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.ShingleFilterFactory" maxShingleSize="5" minShingleSize="2" outputUnigrams="true"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.ShingleFilterFactory" maxShingleSize="5" minShingleSize="2" outputUnigrams="true"/>
<solrQueryParser defaultOperator="OR" />
</analyzer>
</fieldType>
我正在使用ShingleFilterFactory来创建ngrams。现在我想要检索特定单词的所有ngrams。 假设我进入了#34;晚上&#34;然后我想要所有带有夜晚的ngram。
现在,通过以下查询,我从我的文档中获得了所有ngrams的唯一结果:
http://localhost/solr/admin/luke?fl=text&numTerms=50000&wt=json