ElasticSearch: exclude results where some values match a condition

Time: 2015-07-28 15:18:51

Tags: elasticsearch filtering

I have some documents in Elasticsearch like these:

{"id":1000, "user":"A", "type":["foo","bar"]}
{"id":1001, "user":"B", "type":["bar"]}
{"id":1002, "user":"C", "type":["foo"]}
{"id":1003, "user":"A", "type":[]}
{"id":1004, "user":"D", "type":["foo","bar"]}
{"id":1005, "user":"E", "type":[]}
{"id":1006, "user":"F", "type":["bar"]}

I need to filter for users that do not have the value "foo" in the "type" field, so the expected result should be:

{"id":1001, "user":"B", "type":["bar"]}
{"id":1005, "user":"E", "type":[]}
{"id":1006, "user":"F", "type":["bar"]}

I tried this query:

{
  "query": {
    "bool": {
      "must_not": [
        {
          "query_string": {
            "default_field": "type",
            "query": "foo"
          }
        }
      ]
    }
  },
  "from": 0,
  "size": 10
}

But in the results I still see "user":"A", because one of that user's documents has the value [] in "type":

{"id":1003, "user":"A", "type":[]}

But "user":"A" also has a document with "foo" in "type":

{"id":1000, "user":"A", "type":["foo","bar"]}

So is there a way to exclude these users?

If a user has the value "foo" in any of their documents, that user must not be returned in the results.
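
To make the requirement concrete, here is a minimal Python sketch (plain Python, not an Elasticsearch call) that contrasts the two behaviours on the sample documents above. The function names `users_without` and `docs_without` are hypothetical and only serve the illustration.

```python
from collections import defaultdict

# Sample documents copied from the question.
docs = [
    {"id": 1000, "user": "A", "type": ["foo", "bar"]},
    {"id": 1001, "user": "B", "type": ["bar"]},
    {"id": 1002, "user": "C", "type": ["foo"]},
    {"id": 1003, "user": "A", "type": []},
    {"id": 1004, "user": "D", "type": ["foo", "bar"]},
    {"id": 1005, "user": "E", "type": []},
    {"id": 1006, "user": "F", "type": ["bar"]},
]


def docs_without(docs, value):
    """Document-level exclusion: each document is judged on its own."""
    return [d["id"] for d in docs if value not in d["type"]]


def users_without(docs, value):
    """User-level exclusion: drop a user if ANY of their documents has `value`."""
    types_by_user = defaultdict(set)
    for d in docs:
        types_by_user[d["user"]].update(d["type"])
    return sorted(u for u, types in types_by_user.items() if value not in types)


print(docs_without(docs, "foo"))   # [1001, 1003, 1005, 1006]
print(users_without(docs, "foo"))  # ['B', 'E', 'F']
```

The first function mirrors what the must_not query does and lets document 1003 (user A) through; the second is the behaviour being asked for.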

1 answer:

Answer 0: (score: 0)

I don't think you can do what you are asking with the way your index is set up, because a query matches each document on its own and cannot look at a user's other documents. But if you can reorganize your index to take advantage of the parent/child relationship, that can probably solve your problem.

Here is an example. I set up an index with two types, a parent type and a child type, as follows:

PUT /test_index
{
   "mappings": {
      "parent_doc": {
         "properties": {
            "user_name": {
               "type": "string"
            }
         }
      },
      "child_doc":{
          "_parent": {
             "type": "parent_doc"
          },
          "properties": {
              "type_names": {
                  "type": "string"
              }
          }
      }
   }
}

Then I took the data you posted and reorganized it like this (for empty lists I just didn't add a child doc):

POST /test_index/_bulk
{"index":{"_type":"parent_doc","_id":1}}
{"user_name":"A"}
{"index":{"_type":"child_doc","_parent":1}}
{"type_names":["foo","bar"]}
{"index":{"_type":"parent_doc","_id":2}}
{"user_name":"B"}
{"index":{"_type":"child_doc","_parent":2}}
{"type_names":["bar"]}
{"index":{"_type":"parent_doc","_id":3}}
{"user_name":"C"}
{"index":{"_type":"child_doc","_parent":3}}
{"type_names":["foo"]}
{"index":{"_type":"parent_doc","_id":4}}
{"user_name":"D"}
{"index":{"_type":"child_doc","_parent":4}}
{"type_names":["foo","bar"]}
{"index":{"_type":"parent_doc","_id":5}}
{"user_name":"E"}
{"index":{"_type":"parent_doc","_id":6}}
{"user_name":"F"}
{"index":{"_type":"child_doc","_parent":6}}
{"type_names":["bar"]}
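
If you would rather generate that bulk body than write it by hand, something along these lines should work. It is only a sketch: `to_bulk_lines` is a hypothetical helper, parent ids are assumed to be assigned in order of first appearance of each user, and documents with an empty "type" list produce no child doc, as above.

```python
import json

# Original flat documents from the question.
docs = [
    {"id": 1000, "user": "A", "type": ["foo", "bar"]},
    {"id": 1001, "user": "B", "type": ["bar"]},
    {"id": 1002, "user": "C", "type": ["foo"]},
    {"id": 1003, "user": "A", "type": []},
    {"id": 1004, "user": "D", "type": ["foo", "bar"]},
    {"id": 1005, "user": "E", "type": []},
    {"id": 1006, "user": "F", "type": ["bar"]},
]


def to_bulk_lines(docs):
    """Turn flat docs into bulk lines: one parent per user, one child per non-empty type list."""
    parent_ids = {}  # user -> parent _id (assumption: numbered by first appearance)
    children = {}    # user -> list of non-empty type lists
    for doc in docs:
        user = doc["user"]
        if user not in parent_ids:
            parent_ids[user] = len(parent_ids) + 1
            children[user] = []
        if doc["type"]:  # skip empty lists, so no child doc is created for them
            children[user].append(doc["type"])

    lines = []
    for user, pid in parent_ids.items():
        lines.append(json.dumps({"index": {"_type": "parent_doc", "_id": pid}}))
        lines.append(json.dumps({"user_name": user}))
        for type_names in children[user]:
            lines.append(json.dumps({"index": {"_type": "child_doc", "_parent": pid}}))
            lines.append(json.dumps({"type_names": type_names}))
    return lines


# The bulk API expects newline-delimited JSON that ends with a newline.
bulk_body = "\n".join(to_bulk_lines(docs)) + "\n"
print(bulk_body)
```

For the sample data this produces the same 22 lines as the request above, apart from JSON whitespace.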

Then I can query all the users that do not have a child containing the term "foo" as follows:

POST /test_index/parent_doc/_search
{
   "filter": {
      "not": {
         "filter": {
            "has_child": {
               "type": "child_doc",
               "query": {
                  "match": {
                     "type_names": "foo"
                  }
               }
            }
         }
      }
   }
}

which returns:

{
   "took": 69,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 3,
      "max_score": 1,
      "hits": [
         {
            "_index": "test_index",
            "_type": "parent_doc",
            "_id": "2",
            "_score": 1,
            "_source": {
               "user_name": "B"
            }
         },
         {
            "_index": "test_index",
            "_type": "parent_doc",
            "_id": "5",
            "_score": 1,
            "_source": {
               "user_name": "E"
            }
         },
         {
            "_index": "test_index",
            "_type": "parent_doc",
            "_id": "6",
            "_score": 1,
            "_source": {
               "user_name": "F"
            }
         }
      ]
   }
}

Here is the code I used:

http://sense.qbox.io/gist/bd2f4336b650c27013fdc2c64b8c1f649af1814e