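I have a problem with German words. Solr (version 4.0.0) tokenizes Kälte into two wrong tokens. Maybe my definition of the German text field is wrong.
The field is defined as follows: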
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
  </analyzer>
</fieldType>
Debug query output:
<str name="parsedquery">text_de:kã text_de:lte</str><str name="parsedquery_toString">text_de:kã text_de:lte</str>
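Answer 0: (score: 1)
If you are running Tomcat as the application container, you can try editing the server.xml file and adding URIEncoding="UTF-8" to the AJP/1.3 Connector. Without it, Tomcat may not decode the query string as UTF-8, so the umlaut is already garbled before it reaches the Solr analyzer. This is the solution I found.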
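As a rough sketch of that change: the parsed query `kã` / `lte` is what one would expect if the two UTF-8 bytes of `ä` were read as ISO-8859-1, giving `Ã¤`, which is then lowercased and split at the `¤` symbol. The connector entries below use Tomcat's stock port values, which are assumptions here; keep whatever ports your own server.xml already has and add only the URIEncoding attribute.

```xml
<!-- conf/server.xml (Tomcat). Port numbers are the stock defaults and are
     assumptions; only the URIEncoding attribute is the actual change. -->
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443"
           URIEncoding="UTF-8"/>

<!-- Assumption: if Solr is queried directly over HTTP rather than through
     AJP, the HTTP connector probably needs the same attribute. -->
<Connector port="8080" protocol="HTTP/1.1" connectionTimeout="20000"
           redirectPort="8443" URIEncoding="UTF-8"/>
```

Tomcat has to be restarted for the change to take effect. Re-running the debug query should then show a single token for Kälte instead of two.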