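This may be a duplicate question, but I could not find anything related to it:
I have implemented Solr suggestions for a list of cities and areas, using FuzzyLookupFactory. My schema looks like this: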
<fieldType name="suggestTypeLc" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[^a-zA-Z0-9]" replacement=" " />
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
synonyms.txt is used to map old city names to new ones, e.g. Madras => Chennai, Saigon => Ho Chi Minh City
My suggester is defined as follows:
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">suggestions</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">searchfield</str>
<str name="weightField">searchscore</str>
<str name="suggestAnalyzerFieldType">suggestTypeLc</str>
<str name="buildOnStartup">false</str>
<str name="buildOnCommit">false</str>
<str name="storeDir">autosuggest_dict</str>
</lst>
</searchComponent>
My request handler looks like this:
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.count">10</str>
<str name="suggest.dictionary">suggestions</str>
<str name="suggest.dictionary">results</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
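Now the problem is that the suggester shows the exact match first, but only in a case-sensitive way. For example,
/suggest?suggest.q=mumbai (starting with a lowercase "m")
returns the exact match in 4th place: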
{
"responseHeader":{
"status":0,
"QTime":19},
"suggest":{
"suggestions":{
"mumbai":{
"numFound":10,
"suggestions":[{
"term":"Mumbai Domestic Airport",
"weight":11536},
{
"term":"Mumbai Chhatrapati Shivaji Intl Airport",
"weight":11376},
{
"term":"Mumbai Pune Highway",
"weight":2850},
{
"term":"Mumbai",
"weight":2248},
.....
However, calling /suggest?suggest.q=Mumbai (starting with an uppercase "M")
returns the exact match in first place:
{
"responseHeader":{
"status":0,
"QTime":16},
"suggest":{
"suggestions":{
"Mumbai":{
"numFound":10,
"suggestions":[{
"term":"Mumbai",
"weight":2248},
{
"term":"Mumbai Domestic Airport",
"weight":11536},
{
"term":"Mumbai Chhatrapati Shivaji Intl Airport",
"weight":11376},
{
"term":"Mumbai Pune Highway",
"weight":2850},
...
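What am I missing here? What can be done to make Mumbai the first result even when the query is the lowercase "mumbai"? I thought case sensitivity was handled by the "suggestTypeLc" field type I defined.
Answer 0 (score: 1)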
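FuzzyLookupFactory has a hidden configuration parameter, exactMatchFirst, which is described as:
If true, the default, exact suggestions are returned first, even if they are prefixes or other strings in the FST have larger weights.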
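For reference, the parameter can also be written out explicitly in the suggester definition. This is only a sketch based on the configuration from the question; the single added line is exactMatchFirst, shown with its default value, so on its own it should not change the behaviour you are seeing. Setting it to false would presumably leave the ordering to the weights alone.

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggestions</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">searchfield</str>
    <str name="weightField">searchscore</str>
    <str name="suggestAnalyzerFieldType">suggestTypeLc</str>
    <!-- Added for illustration only: true is already the default -->
    <str name="exactMatchFirst">true</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
    <str name="storeDir">autosuggest_dict</str>
  </lst>
</searchComponent>
```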
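According to your configuration, suggestions are ranked by the searchscore field (in your config it is referenced as <str name="weightField">searchscore</str>). That is why, when you query mumbai, all suggestions are ordered by weight.
With exactMatchFirst=true, however, you get Mumbai on top for the query Mumbai, regardless of the weighting. This is how exactMatchFirst affects the ordering. The exact-match check appears to compare the query text as typed against the stored suggestion text rather than against its analyzed (lowercased) form, which would explain why the lowercase mumbai is not treated as an exact match for Mumbai.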
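Unfortunately, I have not found an option to tune your suggester other than getting rid of weightField entirely.
Try turning off field weighting, or try another lookup implementation, e.g. AnalyzingInfixLookupFactory.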
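A rough sketch of what the second option might look like, reusing the field names from the question. The indexPath value autosuggest_infix_index is a made-up directory name, and I have not verified that this configuration puts Mumbai first for a lowercase query, so treat it as a starting point for testing rather than a confirmed fix. To try the first option instead (no field weighting), the weightField line can simply be removed from the suggester definition.

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggestions</str>
    <!-- Swapped in for FuzzyLookupFactory -->
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">searchfield</str>
    <!-- Remove this line to turn off field weighting -->
    <str name="weightField">searchscore</str>
    <str name="suggestAnalyzerFieldType">suggestTypeLc</str>
    <!-- Hypothetical directory name for the infix suggester's index -->
    <str name="indexPath">autosuggest_infix_index</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>
```

After changing the lookup implementation, the suggester has to be rebuilt, for example by sending one request with suggest.build=true, since both buildOnStartup and buildOnCommit are false here.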