Question

I have this Python code in which I create a mapping for Elasticsearch and then use the search query shown below to search the content:

Mapping:

data_mapping = {

        "settings": {
            "analysis": {
                "analyzer": {
                    "es_analyzer": {
                        "tokenizer": "standard",
                        "filter": [

                            "stop_words"

                        ]
                    }
                },
                "filter": {

                    "stop_words": {
                        "type": "standard",
                        "stopwords": "_english_"
                    }
                }
            }
        },
        "mappings": {
            str(bot_name).lower(): {
                "properties": {
                    "qid": {
                        "type": "string",
                        "fields": {
                            "stemmed": {
                                "type": "string"

                            }
                        }
                    },
                    "q": {
                        "type": "array",
                        "fields": {
                            "stemmed": {
                                "type": "string"

                            }
                        }
                    },
                    "a": {
                        "type": "string",
                        "fields": {
                            "stemmed": {
                                "type": "string"

                            }
                        }
                    },
                    "votes": {
                        "type": "integer",
                        "fields": {
                            "stemmed": {
                                "type": "integer"

                            }
                        }
                    }

                }
            }
        }
    }
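A side note on the mapping above, offered as a rough sketch rather than a confirmed fix. As far as I know, the stop-word token filter is registered under the type "stop" (not "standard"), Elasticsearch has no "array" field type (any field can hold several values), and an analyzer defined in settings only takes effect on fields that name it. The helper name build_mapping, the added "lowercase" filter, and the decision to drop the "stemmed" sub-fields are my own assumptions for illustration; the "string" type assumes a pre-5.0 Elasticsearch, as in the original code.

```python
def build_mapping(type_name):
    """Hypothetical variant of data_mapping that actually wires in es_analyzer."""
    # Assumption: analyzed text fields should use the custom analyzer.
    text_field = {"type": "string", "analyzer": "es_analyzer"}
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    "es_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        # "lowercase" is an addition, so that "The" and "the"
                        # are treated alike before stop words are removed.
                        "filter": ["lowercase", "stop_words"],
                    }
                },
                "filter": {
                    # "stop" is the token filter type that accepts "stopwords".
                    "stop_words": {"type": "stop", "stopwords": "_english_"}
                },
            }
        },
        "mappings": {
            type_name: {
                "properties": {
                    "qid": {"type": "string"},
                    # A list of strings is indexed as a multi-valued string field.
                    "q": dict(text_field),
                    "a": dict(text_field),
                    "votes": {"type": "integer"},
                }
            }
        },
    }


mapping = build_mapping("mybot")
```

With the original code this would be called as build_mapping(str(bot_name).lower()).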

Sample data indexed with the data_mapping above:

{"qid":"1","q":["what can you tell me about Google Flag","I want to know about Google Flag","tell me about Google Flag","What is Google Flag"],"a":"Google is a search engine company based out of California USA.","votes":0}

{"qid":"2","q":["How is the Google Flag used"],"a":"Google flag is used search indexing.","votes":0}

{"qid":"3","q":["How is the Google Flag maintained"],"a":"Google means to search.","votes":0}

Query:

data = {
            "query": {
                "function_score": {

                    "query": {

                        "multi_match": {
                            "type": "most_fields",
                            "query": question,
                            "fields": ["q", "English"]

                        }
                    },

                    "field_value_factor": {
                        "field": "votes",
                        "modifier": "log2p"
                    }

                }
            }
        }
        response = es.search(index=str(index_name).lower(), body=data)
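One thing that can be checked without a cluster is the field_value_factor part of the query. The Elasticsearch documentation describes the log2p modifier as "add 2 to the field value and take the common logarithm", and function_score multiplies the query score by the function result by default. The small sketch below (the helper name log2p is mine) applies that to the sample documents, which all have votes = 0, to see whether votes can be responsible for the ranking.

```python
import math


def log2p(votes):
    # log2p modifier as documented: log10(field value + 2)
    return math.log10(votes + 2)


# votes values taken from the three sample documents
sample_votes = {"1": 0, "2": 0, "3": 0}

# Factor that function_score would multiply into each document's query score
factors = {qid: log2p(votes) for qid, votes in sample_votes.items()}
print(factors)
```

Since every sample document gets the same factor, the votes field should not change the relative order here; the ordering has to come from the multi_match score itself.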

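In the above query, I am searching for a question against the q field of the mapped content. Now, when I search for What is google flag, ideally the q field of qid 1 should score highest, but qid 3 scores highest. However, when I search for What is google flag? (with ? added), qid 1 scores highest. I am not able to understand:

Why does qid 3 score highest initially? My guess is that TF/IDF is overpowering the others.
Why does adding ? make qid 1 score highest?
For point 1 above (searching What is google flag), what changes can I make to the mapping/search query so that qid 1 scores highest? How can I force Elasticsearch to value a 100% match more (when there is a one-to-one match)?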
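For the last point, one direction that might be worth trying (a sketch, not a verified answer) is to keep a normal match on q and add an optional match_phrase clause on the same field with a boost, so that a document containing the whole question as a phrase gets extra score. The function name build_query and the boost value 5.0 are placeholders I made up; the bool, match, and match_phrase clauses are standard query DSL.

```python
def build_query(question, phrase_boost=5.0):
    """Hypothetical query body: term match plus a boosted exact-phrase clause."""
    return {
        "query": {
            "bool": {
                # Every hit must match at least some terms of the question.
                "must": {"match": {"q": question}},
                # Optional clause: adds score only when the terms occur
                # as one phrase, in order, within a single q value.
                "should": {
                    "match_phrase": {
                        "q": {"query": question, "boost": phrase_boost}
                    }
                },
            }
        }
    }


data = build_query("What is google flag")
```

The bool part could also be placed inside the existing function_score in place of the multi_match clause, so the votes factor is kept. The boost would need tuning against real data.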


0 Answers: