Searching for words and phrases across multiple fields in Elasticsearch

Time: 2020-03-25 18:35:45

Tags: elasticsearch elasticsearch-dsl elasticsearch-query

I want to search documents in Elasticsearch from Python. I am looking for documents that contain a word and/or a phrase in any of three fields.

GET /my_docs/_search
{
  "query": {
    "multi_match": {
      "query": "Ford \"lone star\"",
      "fields": [
        "title",
        "description",
        "news_content"
      ],
      "minimum_should_match": "-1",
      "operator": "AND"
    }
  }
}

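With the query above, I want to get documents whose title, description, or news_content contains both "Ford" and "lone star" (as a phrase).

However, it does not seem to treat "lone star" as a phrase: it returns documents that contain "Ford", "lone", and "star".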

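1 Answer:

Answer 0 (score: 1)

I was able to reproduce your issue and solve it using Elasticsearch's REST API. I am not familiar with the Python syntax, but since you provided your search query in JSON format, I built the solution on top of that.

Index definition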

{
    "mappings": {
        "properties": {
            "title": {
                "type": "text"
            },
            "description" :{
                "type" : "text"
            },
            "news_content" : {
                "type" : "text"
            }
        }
    }
}

Sample documents (the first one matches your criteria; the second does not, because "lone" and "star" sit in different fields)

{
  "title" : "Ford",
  "news_content" : "lone star",
  "description" : "foo bar"
}

{
  "title" : "Ford",
  "news_content" : "lone",
  "description" : "star"
}

Search query you are looking for. Note that both clauses sit under `must`, so both have to match, and that the second clause sets `"type": "phrase"` so that `lone star` is matched as a phrase.

{
    "query": {
        "bool": {
            "must": [
                {
                    "multi_match": {
                        "query": "ford",
                        "fields": [
                            "title",
                            "description",
                            "news_content"
                        ]
                    }
                },
                {
                    "multi_match": {
                        "query": "lone star",
                        "fields": [
                            "title",
                            "description",
                            "news_content"
                        ],
                        "type": "phrase"
                    }
                }
            ]
        }
    }
}
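
Since the question mentions Python, here is a minimal sketch of how the same request body could be built from a user string such as `Ford "lone star"`. It is not part of the tested REST solution above. The helper name `build_query` and the `FIELDS` constant are hypothetical, and using `shlex.split` to keep quoted text together is my own assumption (it will raise on unbalanced quotes or stray apostrophes).

```python
import json
import shlex

# Hypothetical constant: the three fields from the question.
FIELDS = ["title", "description", "news_content"]


def build_query(user_input, fields=FIELDS):
    """Turn a string like 'Ford "lone star"' into a bool/must query.

    Each bare word becomes a plain multi_match clause; each quoted
    multi-word part becomes a multi_match clause of type "phrase".
    """
    must = []
    # shlex.split keeps quoted text together, so the input
    # 'Ford "lone star"' yields the two terms 'Ford' and 'lone star'.
    for term in shlex.split(user_input):
        clause = {"query": term, "fields": list(fields)}
        if " " in term:
            # Multi-word term came from a quoted part: match it as a phrase.
            clause["type"] = "phrase"
        must.append({"multi_match": clause})
    # All clauses sit under "must", so every term has to match.
    return {"query": {"bool": {"must": must}}}


body = build_query('Ford "lone star"')
print(json.dumps(body, indent=2))
```

The resulting `body` has the same shape as the JSON query above and could then be handed to whichever Python client you use. With the official `elasticsearch` package that is typically a call to `search` on the client with the index name, though the exact keyword arguments depend on the client version.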

The result contains only one of the sample documents

"hits": [
      {
        "_index": "so_phrase",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.9527341,
        "_source": {
          "title": "Ford",
          "news_content": "lone star",
          "description": "foo bar"
        }
      }
    ]