Elasticsearch results are inconsistent

Time: 2020-10-30 12:58:05

Tags: elasticsearch elasticsearch-query

My index contains roughly 40,000 vessel names.

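When I query for a vessel name (e.g. "TUC"), I get a number of results. However, when I shorten the query term to "T", the results I got from the "TUC" query are no longer in the result set.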

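I am somewhat puzzled about what causes this, and I wonder whether the results are being cut off because the total result set is too large.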

Some stats:

Query:

{
    "query" : {
        "bool" : {
            "must" : [
                {
                    "query_string" : {
                        "fields" : ["vesselName"],
                        "type" : "phrase_prefix",
                        "query" : "T"
                    }
                }
            ]
        }
    }
}

Result (first hit):

"max_score": 12.450134,
    "hits": [
        {
            "_index": "vesselsindex",
            "_type": "_doc",
            "_id": "06ad4663-42f6-4771-b350-0d3b7a1b3229",
            "_score": 12.450134,
            "_source": {
                "vesselId": "06ad4663-42f6-4771-b350-0d3b7a1b3229",
                "callSign": "FATA",
                "vesselName": "TAAPE"
            }
        },

Result (when using the search term "TUC"):

{
            "_index": "vesselsindex",
            "_type": "_doc",
            "_id": "e7bea95c-6819-48b1-b52e-0a8fbaeef1df",
            "_score": 11.831188,
            "_source": {
                "vesselId": "e7bea95c-6819-48b1-b52e-0a8fbaeef1df",
                "callSign": "PBAQ",
                "vesselName": "TUCANA"
            }
        },

Settings:

{
"vesselsindex": {
    "settings": {
        "index": {
            "number_of_shards": "1",
            "provided_name": "vesselsindex",
            "max_result_window": "50000",
            "creation_date": "1604061335143",
            "analysis": {
                "analyzer": {
                    "keywordWithCaseIgnore": {
                        "filter": [
                            "lowercase"
                        ],
                        "type": "custom",
                        "tokenizer": "keyword"
                    }
                }
            },
            "number_of_replicas": "1",
            "uuid": "M-m3nIB5TqeiPNR2NR5zWQ",
            "version": {
                "created": "7060099"
            }
        }
    }
}
}

Index stats:

{
"_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
},
"_all": {
    "primaries": {
        "docs": {
            "count": 43510,
            "deleted": 0
        },
        "store": {
            "size_in_bytes": 12762612
        },

1 answer:

Answer 0 (score: 0)

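This happens because, by default, ES returns only the top 10 search results. When you search for T, far more vessel names match, and the documents returned by the TUC query are probably not among those top 10.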
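To make the truncation concrete, here is a toy sketch in plain Python; it does not talk to Elasticsearch. Only TAAPE and TUCANA come from the question. The other vessel names, all of the scores, and the helper `top_hits` are made up for illustration, and real relevance scoring works differently. The one thing it borrows from ES is the default page size of 10.

```python
# Toy model of a search page: every name that starts with the prefix "matches",
# matches are sorted by a made-up score, and only the first `size` are returned.
DEFAULT_SIZE = 10  # default number of hits in an Elasticsearch search response


def top_hits(scored_names, prefix, size=DEFAULT_SIZE):
    """Return the `size` best-scoring names starting with `prefix` (case-insensitive)."""
    matches = [
        (name, score)
        for name, score in scored_names
        if name.lower().startswith(prefix.lower())
    ]
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in matches[:size]]


# Hypothetical data: 20 names starting with "T"; TUCANA scores below the first ten.
vessels = (
    [("TAAPE", 12.45)]
    + [(f"T-VESSEL-{i:02d}", 12.4 - i * 0.01) for i in range(18)]
    + [("TUCANA", 11.83)]
)

hits_t = top_hits(vessels, "T")                      # first page only
hits_tuc = top_hits(vessels, "TUC")                  # narrower query
hits_all = top_hits(vessels, "T", size=len(vessels)) # page large enough for everything

print("TUCANA" in hits_t, hits_tuc, "TUCANA" in hits_all)
```

In this toy data, TUCANA still matches the "T" query; it is just ranked past the first page, so it only shows up once the page is large enough.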

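If you want more search results, increase the size param. Returning a large number of hits in one response is expensive, which is why pagination is normally used to keep search queries fast.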
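As a rough sketch of what pagination could look like, the snippet below builds request bodies that use the `from` and `size` search parameters together. The helper `page_body` and the page sizes are assumptions for illustration; the query is the one from the question. Sending the body to the cluster is left out.

```python
import json

# The query from the question, reused unchanged.
QUERY = {
    "bool": {
        "must": [
            {
                "query_string": {
                    "fields": ["vesselName"],
                    "type": "phrase_prefix",
                    "query": "T",
                }
            }
        ]
    }
}


def page_body(query, page, page_size=10):
    """Build a search request body for a zero-based page number (hypothetical helper)."""
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size must be >= 1")
    # "from" is the offset of the first hit, "size" is the number of hits to return.
    return {"from": page * page_size, "size": page_size, "query": query}


first_page = page_body(QUERY, page=0)
third_page = page_body(QUERY, page=2, page_size=100)
print(json.dumps(third_page, indent=2))
```

Each page then costs about as much as a small search, instead of one request that serializes tens of thousands of documents.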

You can provide the size param either as a URL query parameter or as part of the search request body.

You can try _search?size=44000 in the search request URL, and it should return all the search results.
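For reference, both ways of passing size might be assembled as in the sketch below. The helper names are hypothetical, and the snippet only builds the path and body without sending anything. Note that `from + size` may not exceed the index's `index.max_result_window`, which defaults to 10000; size=44000 is accepted here only because the settings in the question raise it to 50000.

```python
import json
from urllib.parse import urlencode

INDEX = "vesselsindex"      # index name from the question
MAX_RESULT_WINDOW = 50000   # index.max_result_window from the settings in the question


def check_size(size):
    """Reject sizes this index would refuse (hypothetical client-side guard)."""
    if not 0 <= size <= MAX_RESULT_WINDOW:
        raise ValueError(f"size must be between 0 and {MAX_RESULT_WINDOW}")
    return size


def search_path_with_size(index, size):
    """Option 1: pass size as a URL query parameter."""
    params = urlencode({"size": check_size(size)})
    return f"/{index}/_search?{params}"


def search_body_with_size(query, size):
    """Option 2: pass size inside the request body."""
    return {"size": check_size(size), "query": query}


query = {
    "query_string": {"fields": ["vesselName"], "type": "phrase_prefix", "query": "T"}
}
path = search_path_with_size(INDEX, 44000)
body = search_body_with_size(query, 44000)

print(path)
print(json.dumps(body, indent=2))
```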