I inserted 3 records into an ElasticSearch index, as shown below:
curl -XPOST 'http://127.0.0.1:9200/geoindex_test/STREET?pretty=1' -d '
{ "cityNames" : [ { "language" : "ENG",
"name" : "w bridgewater",
"raw_name" : "W BRIDGEWATER"
},
{ "language" : "ENG",
"name" : "west bridgewater",
"raw_name" : "West Bridgewater"
}
],
"id" : 1,
"streetNames" : [ { "language" : "ENG",
"name" : "cram rd",
"raw_name" : "Cram Rd"
} ]
}'
curl -XPOST 'http://127.0.0.1:9200/geoindex_test/STREET?pretty=1' -d '
{ "cityNames" : [ { "language" : "ENG",
"name" : "bridgewater corners",
"raw_name" : "BRIDGEWATER CORNERS"
},
{ "language" : "ENG",
"name" : "bridgewater center",
"raw_name" : "Bridgewater Center"
}
],
"id" : 2,
"streetNames" : [ { "language" : "ENG",
"name" : "valley view rd",
"raw_name" : "Valley View Rd"
} ]
}'
curl -XPOST 'http://127.0.0.1:9200/geoindex_test/STREET?pretty=1' -d '
{ "cityNames" : [ { "language" : "ENG",
"name" : "bridgewater",
"raw_name" : "Bridgewater"
},
{ "language" : "ENG",
"name" : "windsor",
"raw_name" : "Windsor"
}
],
"id" : 3,
"streetNames" : [ { "language" : "ENG",
"name" : "valley view rd",
"raw_name" : "Valley View Rd"
} ]
}'
I then search as follows:
curl -XGET 'http://127.0.0.1:9200/geoindex_test/STREET/_search?pretty=1' -d '
{
"query" : {
"match" : { "cityNames.name" : "bridgewater" }
}
}'
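I expected ElasticSearch to return the third record (id == 3) as the best match, since record 3 is the only one whose city name matches "bridgewater" exactly. Instead, it returns the record with id 1 (w bridgewater) as the best match. What am I doing wrong?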
Answer 0 (score: 1)
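I suspect this happens because you are using inner objects, which essentially collapse the objects beneath them into a single field for search purposes. For example, when you query the name field of object 1, you are actually searching ["w bridgewater", "west bridgewater"] rather than the discrete fields you have in mind.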
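Because 'bridgewater' appears twice in objects 1 and 2 (once in each of the two name fields) but only once in object 3, those items rank higher in the search. Object 1 ends up on top because the field in which 'bridgewater' appears is shorter than the string in object 2 ("w bridgewater" versus "bridgewater corners").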
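To make the flattening concrete, here is a minimal Python sketch that mimics it outside ElasticSearch. It only counts how often the term occurs once all `cityNames.name` values of a document are merged into one field. The whitespace split is a rough stand-in for the real analyzer, and the helper name `flattened_term_count` is made up for this illustration; it does not reproduce the actual scoring.

```python
from collections import Counter

# The cityNames.name values of the three indexed documents, keyed by id.
docs = {
    1: ["w bridgewater", "west bridgewater"],
    2: ["bridgewater corners", "bridgewater center"],
    3: ["bridgewater", "windsor"],
}


def flattened_term_count(names, term):
    """Count `term` after merging all names into one bag of tokens.

    Hypothetical helper: with inner objects, every cityNames.name value of a
    document lands in the same field, so the boundaries between the
    individual names are lost.
    """
    tokens = []
    for name in names:
        # Assumption: a plain whitespace split approximates the analyzer.
        tokens.extend(name.split())
    return Counter(tokens)[term]


counts = {doc_id: flattened_term_count(names, "bridgewater")
          for doc_id, names in docs.items()}
print(counts)  # {1: 2, 2: 2, 3: 1}
```

Documents 1 and 2 each contribute two occurrences of the term to the flattened field, while document 3 contributes only one, even though it is the only exact match.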
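Instead of using inner objects as you do, use nested objects: http://www.elasticsearch.org/guide/reference/mapping/nested-type/. Setting the score mode to "max" on the nested query will make matches behave more intuitively for you.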
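As a rough sketch of what that could look like, the Python snippet below builds the two JSON bodies involved: a mapping that declares `cityNames` as nested, and a nested query with `score_mode` set to `max`. The bodies are only printed, so they can be passed to `curl -d` as in the question. The mapping layout with the `STREET` type name is an assumption that fits older, type-based ElasticSearch versions; newer versions drop the type level. The function name `nested_city_query` is made up for this example.

```python
import json

# Assumption: type-based mapping layout for the "STREET" type used in the
# question. Only cityNames is declared; other fields stay dynamically mapped.
nested_mapping = {
    "mappings": {
        "STREET": {
            "properties": {
                "cityNames": {"type": "nested"},
            }
        }
    }
}


def nested_city_query(city, score_mode="max"):
    """Build a nested query body (hypothetical helper for this sketch).

    Each cityNames entry is matched on its own; with score_mode "max" the
    best-scoring entry determines the score of the parent document.
    """
    return {
        "query": {
            "nested": {
                "path": "cityNames",
                "score_mode": score_mode,
                "query": {"match": {"cityNames.name": city}},
            }
        }
    }


search_body = nested_city_query("bridgewater")
print(json.dumps(nested_mapping, indent=2))
print(json.dumps(search_body, indent=2))
```

An existing object field cannot be switched to nested in place, so the index would likely need to be recreated with this mapping and the three records inserted again before running the query.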