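Trouble running a basic Elasticsearch query against a dataset

Time: 2019-03-28 17:29:59

Tags: elasticsearch

I am trying to run a boolean query against an Elasticsearch cluster loaded with the basic Shakespeare dataset. I have checked many resources and everything looks correct, but when I run the query, the OR between the two speech_number ranges does not behave as expected.

I have gone through various tutorials and the documentation on Elasticsearch boolean queries, but I still cannot see why the logic does not work the way I expect.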

    {
        "query": {
            "bool": {
                "must": [
                    {
                        "match": {"play_name": "Henry IV"}
                    },
                    {
                        "bool": {
                            "should": [
                                {"range": {"speech_number": {"lte": 50}}},
                                {"range": {"speech_number": {"gte": 4}}}
                            ]
                        }
                    }
                ]
            }
        }
    }

A sample of the shakespear.json file I am querying looks like this:

{"line_id":1658,"play_name":"Henry IV","speech_number":26,"line_number":"3.1.108","speaker":"MORTIMER","text_entry":"Yea, but"}
{"index":{"_index":"shakespeare","_type":"line","_id":1658}}
{"line_id":1659,"play_name":"Henry IV","speech_number":26,"line_number":"3.1.109","speaker":"MORTIMER","text_entry":"Mark how he bears his course, and runs me up"}
{"index":{"_index":"shakespeare","_type":"line","_id":1659}}
{"line_id":1660,"play_name":"Henry IV","speech_number":26,"line_number":"3.1.110","speaker":"MORTIMER","text_entry":"With like advantage on the other side;"}
{"index":{"_index":"shakespeare","_type":"line","_id":1660}}
{"line_id":1661,"play_name":"Henry IV","speech_number":26,"line_number":"3.1.111","speaker":"MORTIMER","text_entry":"Gelding the opposed continent as much"}
{"index":{"_index":"shakespeare","_type":"line","_id":1661}}
{"line_id":1662,"play_name":"Henry IV","speech_number":26,"line_number":"3.1.112","speaker":"MORTIMER","text_entry":"As on the other side it takes from you."}
{"index":{"_index":"shakespeare","_type":"line","_id":1662}}
{"line_id":1663,"play_name":"Henry IV","speech_number":27,"line_number":"3.1.113","speaker":"EARL OF WORCESTER","text_entry":"Yea, but a little charge will trench him here"}
{"index":{"_index":"shakespeare","_type":"line","_id":1663}}
{"line_id":1664,"play_name":"Henry IV","speech_number":27,"line_number":"3.1.114","speaker":"EARL OF WORCESTER","text_entry":"And on this north side win this cape of land;"}
{"index":{"_index":"shakespeare","_type":"line","_id":1664}}
{"line_id":1665,"play_name":"Henry IV","speech_number":27,"line_number":"3.1.115","speaker":"EARL OF WORCESTER","text_entry":"And then he runs straight and even."}
{"index":{"_index":"shakespeare","_type":"line","_id":1665}}
{"line_id":1666,"play_name":"Henry IV","speech_number":28,"line_number":"3.1.116","speaker":"HOTSPUR","text_entry":"Ill have it so: a little charge will do it."}
{"index":{"_index":"shakespeare","_type":"line","_id":1666}}
{"line_id":1667,"play_name":"Henry IV","speech_number":29,"line_number":"3.1.117","speaker":"GLENDOWER","text_entry":"Ill not have it alterd."}
{"index":{"_index":"shakespeare","_type":"line","_id":1667}}
{"line_id":1668,"play_name":"Henry IV","speech_number":30,"line_number":"3.1.118","speaker":"HOTSPUR","text_entry":"Will not you?"}
{"index":{"_index":"shakespeare","_type":"line","_id":1668}}
{"line_id":1669,"play_name":"Henry IV","speech_number":31,"line_number":"3.1.119","speaker":"GLENDOWER","text_entry":"No, nor you shall not."}
{"index":{"_index":"shakespeare","_type":"line","_id":1669}}
{"line_id":1670,"play_name":"Henry IV","speech_number":32,"line_number":"3.1.120","speaker":"HOTSPUR","text_entry":"Who shall say me nay?"}
{"index":{"_index":"shakespeare","_type":"line","_id":1670}}
{"line_id":1671,"play_name":"Henry IV","speech_number":33,"line_number":"3.1.121","speaker":"GLENDOWER","text_entry":"Why, that will I."}

Expected result: play_name AND (speech_number <= 50 OR speech_number >= 4)
What I get: play_name AND (speech_number <= 50 AND speech_number >= 4)

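1 Answer:

Answer 0: (score: 0)

You are right, the query is executing:

  • Must match:
      1) any word in "Henry IV" [1, see below]
      2) speech_number <= 50 [OR] speech_number >= 4

Elasticsearch also scores the results: everything inside must has to match, and everything inside the should clauses can then boost the result (at least one of the should clauses has to match).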
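To see why the OR and the AND can look identical on this data, here is a minimal sketch in plain Python (a toy model, not Elasticsearch code). The sample values are the speech_number values from the excerpt in the question; the values 1, 3, 51 and 200 are hypothetical and only there to show where the two readings would differ.

```python
# Toy model of the nested bool/should clause; plain Python, not the Elasticsearch API.

def lte_50(n):
    return n <= 50

def gte_4(n):
    return n >= 4

def should_match(n):
    # A bool with only should clauses needs at least one clause to match (OR).
    return lte_50(n) or gte_4(n)

def both_match(n):
    # What an AND of the two ranges would require.
    return lte_50(n) and gte_4(n)

# speech_number values that appear in the excerpt from the question
sample = [26, 27, 28, 29, 30, 31, 32, 33]

or_hits = [n for n in sample if should_match(n)]
and_hits = [n for n in sample if both_match(n)]

# Hypothetical values outside 4..50: only here do OR and AND disagree.
outside = [n for n in (1, 3, 51, 200) if should_match(n) and not both_match(n)]
```

On the sample, or_hits and and_hits are the same list, so the results alone cannot tell the two apart. Only documents with a speech_number below 4 or above 50 would show the difference.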

To boost on speech_number even further (don't do this), you can add more should clauses that match:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": { 
                        "play_name": "Henry IV"
                    }
                },

                {
                    "bool": {
                        "should": [
                            {
                                "range": {
                                    "speech_number": { "lte": 50 }
                                }
                            },

                            {
                                "range": {
                                    "speech_number": { "lte": 40 }
                                }
                            },

                            {
                                "range": {
                                    "speech_number": { "lte": 30 }
                                }
                            },

                            ...
                        ]
                    }
                }
            ]
        }
    }
}

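So part of the problem may be that lte: 50 also allows values below 4, and gte: 4 also allows values above 50, so together the two clauses match every speech_number. I don't see any such outliers in your sample, though. If what you care about is the ordering, range also accepts a boost (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html), so instead of writing several ranges you can: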

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "play_name": {
                            "query": "Henry IV",
                            "operator": "and"
                        }
                    }
                },

                {
                    "bool": {
                        "should": [
                            {
                                "range": {
                                    "speech_number": { 
                                        "gte": 25,
                                        "lte": 50,

                                        "boost": 3
                                    }
                                }
                            },

                            {
                                "range": {
                                    "speech_number": { 
                                        "gte": 4,
                                        "lte": 50
                                    }
                                }
                            }
                        ]
                    }
                }
            ]
        }
    }
}
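As a rough illustration of how the boosted ranges affect ordering, here is a simplified sketch in plain Python. It assumes that every matching range clause contributes its boost (1 when omitted) and that the contributions are summed; the real score also includes the relevance of the match on play_name, so treat the numbers as relative only. The speech_number values 10, 26 and 60 are hypothetical inputs.

```python
# Simplified scoring model for the should clauses above; not Elasticsearch's real implementation.

RANGES = [
    {"gte": 25, "lte": 50, "boost": 3.0},
    {"gte": 4, "lte": 50, "boost": 1.0},  # assumed default boost of 1 when none is given
]

def should_score(speech_number):
    """Sum the boosts of every range clause that matches the value."""
    return sum(
        r["boost"] for r in RANGES
        if r["gte"] <= speech_number <= r["lte"]
    )

# 26 matches both ranges, 10 only the second one, 60 matches neither.
scores = {n: should_score(n) for n in (10, 26, 60)}
```

In this model a value inside 25..50 ranks above one inside 4..24, and a value that matches no range scores 0, which corresponds to the document being dropped because the nested bool sits inside must.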

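[1] match defaults to OR: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query.html. If your data is structured, a term query, or adding the operator and to the match, is probably closer to what you need. That is not the problem in your question, though :)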
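The difference between the OR and AND operators can be sketched in plain Python. This is a toy model that assumes the field is analyzed into lowercase, whitespace-separated tokens; the real analyzer and the actual mapping of play_name may behave differently, and the list of play names is only illustrative.

```python
# Toy model of match with operator "or" vs "and"; not the Elasticsearch API.

def tokens(text):
    # Assumption: lowercase + whitespace split stands in for the analyzer.
    return text.lower().split()

def match(field_value, query, operator="or"):
    field_terms = set(tokens(field_value))
    hits = [term in field_terms for term in tokens(query)]
    # "or": any query term may match; "and": every query term must match.
    return all(hits) if operator == "and" else any(hits)

plays = ["Henry IV", "Henry V", "Hamlet"]  # illustrative play names

or_matches = [p for p in plays if match(p, "Henry IV", operator="or")]
and_matches = [p for p in plays if match(p, "Henry IV", operator="and")]
```

With OR, the shared term "henry" is enough for "Henry V" to match as well; with AND, only "Henry IV" remains in this list.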