Question

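I am trying to learn and write Elasticsearch queries. I found out that there is an "exists" query, which matches documents depending on whether a specified field exists. To understand it, I wrote a simple query, and now I would like to learn more and experiment with the query structure.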

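I have a query that simply checks whether at least one of the specified fields exists. However, I want to give more weight to one of the fields. Here is my query:

{
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        {
          "exists": {
            "field": "geo"
          }
        },
        {
          "exists": {
            "field": "location"
          }
        }
      ]
    }
  },
  "size": 100
}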

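I want to get all documents that have the geo field first (for example, suppose 30 documents contain the geo field), and the remaining 70 results (size minus the number of documents with the geo field) should be documents that contain the location field (the other should clause). So in my case the existence of the location field carries less weight than the existence of the geo field.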

I tried boosting for this, but it had no effect in my case:

{
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        {
          "exists": {
            "field": "geo",
            "boost": 5
          }
        },
        {
          "exists": {
            "field": "location"
          }
        }
      ]
    }
  },
  "size": 100
}

When I change minimum_should_match to 2, it returns only documents in which the geo field exists.
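
The effect of minimum_should_match can be illustrated outside Elasticsearch. What follows is a minimal Python sketch, not Elasticsearch code: the helper matches_bool_should and the sample documents are hypothetical, and an exists clause is approximated as "the key is present with a non-null value". Under those assumptions, a value of 1 admits documents that have either field, while a value of 2 admits only documents that have both geo and location.

```python
def field_exists(doc, field):
    """Rough stand-in for an `exists` clause: key present and not None."""
    return doc.get(field) is not None


def matches_bool_should(doc, fields, minimum_should_match):
    """Return True if at least `minimum_should_match` exists clauses match."""
    matched = sum(1 for f in fields if field_exists(doc, f))
    return matched >= minimum_should_match


# Hypothetical sample documents, for illustration only.
docs = [
    {"id": 1, "geo": {"lat": 1.0, "lon": 2.0}, "location": "California, USA"},
    {"id": 2, "location": "Berlin, Germany"},
    {"id": 3, "geo": {"lat": 3.0, "lon": 4.0}},
    {"id": 4},
]

fields = ["geo", "location"]
either = [d["id"] for d in docs if matches_bool_should(d, fields, 1)]
both = [d["id"] for d in docs if matches_bool_should(d, fields, 2)]
print(either)  # [1, 2, 3]
print(both)    # [1]
```

In this model, raising the threshold changes which documents match at all; it does not change the order of the ones that do.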

Answer 1

In this case you should not use boost. Use sort instead:

{
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        {
          "exists": {
            "field": "geo"
          }
        },
        {
          "exists": {
            "field": "location"
          }
        }
      ]
    }
  },
  "size": 100,
  "sort": [
    { "geo": { "order": "asc" } },
    { "location": { "order": "asc" } }
  ]
}

This way the results are sorted: documents that have the geo field come first, then documents that have the location field.

Answer 2

You should try this query:

{
  "query": {
    "function_score": {
      "functions": [
        {
          "filter": {
            "exists": {
              "field": "geo"
            }
          },
          "weight": 2
        },
        {
          "filter": {
            "exists": {
              "field": "location"
            }
          },
          "weight": 1
        }
      ]
    }
  },
  "from": 0,
  "_source": [
    "geo", "location"
  ],
  "size": 100
}

It produces the following results:

 {
    "_index": "mentions",
    "_type": "post",
    "_id": "1",
    "_score": 2,
    "_source": {
      "geo": {
        "lon": XXX,
        "lat": XXX
      },
      "location": "California, USA"
    }
  },

{
    "_index": "mentions",
    "_type": "post",
    "_id": "2",
    "_score": 1,
    "_source": {
      "location": "Berlin, Germany"
    }
  }

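The first document gets a score of 2 because it has the geo field, while the second gets a score of 1 because it has only the location field.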
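
The scores above can be reproduced with a small model of how function_score combines weights. This is a rough Python sketch rather than Elasticsearch itself: it assumes the defaults score_mode=multiply and boost_mode=multiply, a base query score of 1.0 (no query clause is given, so every document matches), and hypothetical sample documents. The helper function_score is an illustrative name, not a library call.

```python
from math import prod


def function_score(doc, functions, query_score=1.0):
    """Multiply the weights of all functions whose filter field exists.

    Models score_mode=multiply and boost_mode=multiply; when no function
    matches, the product over an empty list is 1, so the score is unchanged.
    """
    weights = [w for field, w in functions if doc.get(field) is not None]
    return query_score * prod(weights)


# (field, weight) pairs taken from the query in this answer.
functions = [("geo", 2), ("location", 1)]

# Hypothetical sample documents, for illustration only.
docs = [
    {"id": 1, "geo": {"lat": 1.0, "lon": 2.0}, "location": "California, USA"},
    {"id": 2, "location": "Berlin, Germany"},
    {"id": 3, "geo": {"lat": 3.0, "lon": 4.0}},
    {"id": 4},
]

scores = {d["id"]: function_score(d, functions) for d in docs}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores)  # {1: 2.0, 2: 1.0, 3: 2.0, 4: 1.0}
print(ranked)  # [1, 3, 2, 4]
```

In this model a document with neither field still scores 1, the same as a document that has only location. If such documents should be excluded, one option would be to put the original bool query with its two exists clauses in the query part of function_score.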

Giving more weight to the existence of one field
