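I'm looking at switching from Solr to Elasticsearch. I indexed a bunch of documents without providing a schema/mapping, and many of the fields that I had previously set up as indexed strings in Solr have been mapped as `text` with a `keyword` sub-field, using multi-fields.
Is there any benefit to having a `keyword` sub-field on a `text` field as a multi-field? In my case most of the values in these fields are single words, so I'd imagine it doesn't matter whether they are sent through an analyzer. However, the ES docs seem to imply that `keyword` fields are handled differently, or not considered at all, when searching?
If I search for the term "ipad", will a document score higher if it has "ipad" in a keyword field as well as in other text fields, compared with the same document without the keyword field? And if "ipad" appears only in a keyword field, will the document still match?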
Answer 0 (score: 3)
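To answer my own question, I put together a quick test. Keyword and text fields turned out to be pretty much equivalent when searching, and a multi-field seems to get the same score as its main type, so I guess the secondary field has no effect on search scoring.
Strangely, multi-word values got the same score in both the keyword field and the text field. I expected the keyword field to score lower or not match at all, but this is fine for my purposes, so I'm not going to investigate it any further.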
PUT test_index
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "test_type": {
      "properties": {
        "multifield": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "keywordfield": {
          "type": "keyword"
        },
        "textfield": {
          "type": "text"
        }
      }
    }
  }
}
POST /_bulk
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 1 } }
{ "doc" : { "multifield" : "ipad" }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 2 } }
{ "doc" : { "keywordfield" : "ipad" }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 3 } }
{ "doc" : { "keywordfield" : "a green ipad" }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 4 } }
{ "doc" : { "textfield" : "a yellow ipad" }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 5 } }
{ "doc" : { "keywordfield" : "ipad", "textfield" : "ipad" }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 6 } }
{ "doc" : { "keywordfield" : "unrelated", "textfield" : "hopefully this wont show up" }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 7 } }
{ "doc" : { "textfield" : "ipad" }, "doc_as_upsert" : true }
GET /test_index/_search?q=ipad
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 0.28122374,
    "hits": [
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "5",
        "_score": 0.28122374,
        "_source": {
          "keywordfield": "ipad",
          "textfield": "ipad"
        }
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "1",
        "_score": 0.2734406,
        "_source": {
          "multifield": "ipad"
        }
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "2",
        "_score": 0.2734406,
        "_source": {
          "keywordfield": "ipad"
        }
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "7",
        "_score": 0.2734406,
        "_source": {
          "textfield": "ipad"
        }
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "3",
        "_score": 0.16417998,
        "_source": {
          "keywordfield": "a green ipad"
        }
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "4",
        "_score": 0.16417998,
        "_source": {
          "textfield": "a yellow ipad"
        }
      }
    ]
  }
}
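A possible follow-up for anyone who does want to investigate the identical scores: the `?q=ipad` request above names no field, so it runs against the default search field rather than against `keywordfield` or `textfield` individually. Assuming this is an Elasticsearch 5.x cluster (the post does not state the version; the `_type` in the mapping only narrows it down), that default is most likely the analyzed `_all` catch-all field, which would explain why "a green ipad" in a keyword field still matched. A minimal sketch of field-specific queries against the same `test_index`, which should isolate how each field type behaves:

```
# Exact-term lookup on the keyword field. The value is indexed unanalyzed,
# so only documents whose whole value is "ipad" would be expected to match.
GET /test_index/_search
{
  "query": { "match": { "keywordfield": "ipad" } }
}

# Same term against the analyzed text field. "a yellow ipad" is tokenized,
# so it would be expected to match here as well.
GET /test_index/_search
{
  "query": { "match": { "textfield": "ipad" } }
}

# The keyword sub-field of the multi-field is addressed as multifield.keyword.
GET /test_index/_search
{
  "query": { "term": { "multifield.keyword": "ipad" } }
}
```

If the first query returns documents 2 and 5 but not document 3, that would suggest the keyword field itself only matches on the exact value, and that the earlier result came from the default field rather than from the keyword field.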