我发现它支持很多语言而不是中文。
更新 我可以使用Luke搜索我的数据,请看截图 试过这个: 添加
<fieldType name="text_chinese" class="solr.TextField">
<analyzer type="index">
<tokenizer class="solr.CJKTokenizerFactory"/>
</analyzer>
</fieldType>
到schema.xml,也尝试了
<fieldType name="text_chinese" class="solr.TextField">
<analyzer class="org.apache.lucene.analysis.cn.ChineseAnalyzer"/>
</fieldType>
和
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.CJKTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
但这一切都没有。
我想知道“<fieldType name=
”的格式究竟是什么,它是text_chinese还是text_cn或text_zh_cn?但我尝试了所有这些格式。
顺便说一下,我可以使用engnlish属性搜索我的文档。我无法通过输入搜索 在我的网络浏览器用户界面中“名字:某事”,我有例外:
org.springframework.dao.InvalidDataAccessApiUsageException: Bad Request
Bad Request
request: http://localhost:8180/solr/select?q=name:*name:something* or description:*name:something* or type:*name:something* or mac_address:*name:something* or uri:*name:something* or attrs:*name:something*&start=0&rows=0&wt=javabin&version=2; nested exception is org.apache.solr.common.SolrException: Bad Request
Bad Request
request: http://localhost:8180/solr/select?q=name:*name:something* or description:*name:something* or type:*name:something* or mac_address:*name:something* or uri:*name:something* or attrs:*name:something*&start=0&rows=0&wt=javabin&version=2
at org.springframework.data.solr.core.SolrExceptionTranslator.translateExceptionIfPossible(SolrExceptionTranslator.java:58)
at org.springframework.data.solr.core.SolrTemplate.execute(SolrTemplate.java:106)
at org.springframework.data.solr.core.SolrTemplate.count(SolrTemplate.java:126)
at org.springframework.data.solr.repository.query.AbstractSolrQuery$CollectionExecution.count(AbstractSolrQuery.java:92)
at org.springframework.data.solr.repository.query.AbstractSolrQuery$CollectionExecution.execute(AbstractSolrQuery.java:87)
at org.springframework.data.solr.repository.query.AbstractSolrQuery.execute(AbstractSolrQuery.java:52)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.invoke(RepositoryFactorySupport.java:313)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
这是一个弹簧数据solr错误吗?
答案 0 :(得分:0)
SmartChineseAnalyzer似乎是一个典型的分析器实现,用于索引中文文本。
smartcn API中提供了更多工具,CJKTokenizer是中文(以及日语和韩语,因此名称)的良好标记符。
答案 1 :(得分:0)
<fieldType name="wc_text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.ChineseTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0" preserveOriginal="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.ChineseTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0" preserveOriginal="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType
将上述过滤器工厂用于中文字符,并在应用程序端使用UTF-8编码。