I have a multi-valued field in Solr that contains user names:
{
"counsel_for_department": [
"mr a g srivastava with mr xyz doe",
" mr johh david and mr john deo",
" mr n p smith and mr ng smith"
]
}
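When I query with fq=counsel_for_department:a g srivastava
it does not return any results. I am using the standard tokenizer for this field.
The field type of this field is text_general.
Please let me know if multi-valued fields need a different configuration.
This is the JSON response I am getting: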
{
"responseHeader": {
"status": 0,
"QTime": 20,
"params": {
"q": "*:*",
"indent": "true",
"fl": "counsel_for_department",
"fq": [
"doc_type:source_analysis",
"counsel_for_department:*g*c*Srivastava*"
],
"rows": "100",
"wt": "json",
"debugQuery": "true",
"_": "1459351342391"
}
},
"response": {
"numFound": 0,
"start": 0,
"docs": []
},
"debug": {
"rawquerystring": "*:*",
"querystring": "*:*",
"parsedquery": "MatchAllDocsQuery(*:*)",
"parsedquery_toString": "*:*",
"explain": {},
"QParser": "LuceneQParser",
"filter_queries": [
"doc_type:source_analysis",
"counsel_for_department:*g*c*Srivastava*"
],
"parsed_filter_queries": [
"doc_type:source_analysis",
"counsel_for_department:*g*c*srivastava*"
],
"timing": {
"time": 20,
"prepare": {
"time": 16,
"query": {
"time": 16
},
"facet": {
"time": 0
},
"facet_module": {
"time": 0
},
"mlt": {
"time": 0
},
"highlight": {
"time": 0
},
"stats": {
"time": 0
},
"expand": {
"time": 0
},
"debug": {
"time": 0
}
},
"process": {
"time": 3,
"query": {
"time": 3
},
"facet": {
"time": 0
},
"facet_module": {
"time": 0
},
"mlt": {
"time": 0
},
"highlight": {
"time": 0
},
"stats": {
"time": 0
},
"expand": {
"time": 0
},
"debug": {
"time": 0
}
}
}
}
}
Thanks in advance.
Answer 0 (score: 1)
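Wildcard queries are not analyzed, so in most cases it is better to stay away from them and use regular term matching instead. That way documents match regardless of the order of the terms, so "john oliver" also matches "oliver john", and an exact "john oliver" gets boosted through phrase matching.
To expand: a wildcard match can only happen if an actual token in the underlying index matches the pattern. With a tokenizer and filter chain in place, each value is split into separate tokens, so a pattern that has to span whitespace generally will not match anything.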
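To make the point about tokens concrete, here is a rough simulation in Python. It is only a sketch: `simple_tokens` is a hypothetical stand-in that lowercases and splits on whitespace, which approximates what a standard tokenizer plus a lowercase filter would do for these sample values but is not Solr's actual analysis chain. The helper name `wildcard_hits` is also made up for illustration.

```python
from fnmatch import fnmatchcase

def simple_tokens(value):
    # Assumption: lowercase + whitespace split as a rough approximation
    # of the index-time analysis for these simple sample values.
    return value.lower().split()

values = [
    "mr a g srivastava with mr xyz doe",
    " mr johh david and mr john deo",
    " mr n p smith and mr ng smith",
]

# Every token produced for the field, across all of its values.
indexed = [tok for v in values for tok in simple_tokens(v)]

def wildcard_hits(pattern):
    # A wildcard pattern is compared against individual tokens,
    # never against the original untokenized string.
    return [tok for tok in indexed if fnmatchcase(tok, pattern)]

print(wildcard_hits("*g*c*srivastava*"))  # no single token contains all parts
print(wildcard_hits("srivas*"))           # a pattern within one token can match
```

No single token contains "g", "c" and "srivastava" together, so the pattern from the question has nothing to match, while a pattern that stays inside one token can.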
Drop the wildcards and use proper term matching (which is what Solr does really well).
Answer 1 (score: 0)
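As a minimal sketch of what "proper matching" could look like for the field in the question, the Python helpers below build two filter queries: a quoted phrase, and a group of terms that must all be present in any order. The helper names `phrase_fq` and `all_terms_fq` are hypothetical, and the second one assumes the input consists of plain words with no query-syntax characters. Note that with the standard query parser, an unquoted `counsel_for_department:a g srivastava` scopes only `a` to the field; the remaining words are searched in the default field.

```python
from urllib.parse import urlencode

def phrase_fq(field, text):
    # Quote the text so the whole phrase is searched in `field`.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'

def all_terms_fq(field, text):
    # Group the terms so each one is scoped to `field`; order is ignored.
    # Assumption: `text` contains only plain words.
    return f"{field}:(" + " AND ".join(text.split()) + ")"

fq_phrase = phrase_fq("counsel_for_department", "a g srivastava")
fq_terms = all_terms_fq("counsel_for_department", "a g srivastava")

# URL-encode the parameters for a /select request.
print(urlencode({"q": "*:*", "fq": fq_phrase}))
print(fq_terms)
```

The phrase form requires the words to appear next to each other in that order within one value, while the grouped form only requires each word to appear somewhere in the field.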
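For a plain text search, you could go with:
fq=counsel_for_department:*a g srivastava*
// or you can also use:
fq=counsel_for_department:*a*g*srivastava*
Try it this way first. Be aware that this is a relatively expensive (slow) query in Solr. As an improvement, if the query turns out to be very expensive (takes too much time), you could convert the multi-valued field into a single unified field and query that field instead of the multi-valued one.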