Filtering an array of dicts that must all contain the specified values

Time: 2015-05-25 15:27:58

Tags: elasticsearch

Say I have this document:

{
   "_index": "food",
   "_type": "recipes",
   "_id": "AU2LjsMLOuShTUj_LBrT",
   "_score": 1,
   "_source": {
      "name": "granola bars",
      "ingredients": [
         {
            "name": "butter",
            "quantity": 4
         },
         {
            "name": "granola",
            "quantity": 6
         }
      ]
   }
}

The following filter matches this document just fine:

POST /food/recipes/_search
{
   "query": {
      "filtered": {
         "query": {
            "match_all": {}
         },
         "filter": {
            "nested": {
               "path": "ingredients",
               "filter": {
                  "bool": {
                     "must": [
                        {
                           "terms": {
                              "ingredients.name": [
                                 "butter",
                                 "granola"
                              ]
                           }
                        }
                     ]
                  }
               }
            }
         }
      }
   }
}

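However, it also matches documents that have other ingredients. How can I query so that it only matches documents whose only ingredients are butter and granola?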

1 answer:

Answer 0: (score: 1)

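What you need is a "double negative". You want to match parent documents that have nested documents matching your query, and that have no nested documents that fail to match it.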
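The double negative can be sketched outside Elasticsearch as plain set logic. This is only an illustration in Python, not how Elasticsearch evaluates the filter internally; the function name `matches_only` and the in-memory `recipes` data are assumptions made up for the example.

```python
def matches_only(ingredient_names, allowed):
    """Double-negative check over one parent document's nested ingredient names."""
    allowed = set(allowed)
    # Positive half: at least one nested document matches the allowed list.
    has_match = any(name in allowed for name in ingredient_names)
    # Negative half: some nested document falls outside the allowed list.
    has_non_match = any(name not in allowed for name in ingredient_names)
    return has_match and not has_non_match


# Illustrative data only (hypothetical in-memory stand-ins for indexed recipes).
recipes = {
    "granola bars": ["butter", "granola"],
    "granola cookies": ["butter", "granola", "milk", "sugar"],
}
allowed = ["butter", "granola"]

hits = [name for name, names in recipes.items() if matches_only(names, allowed)]
print(hits)  # ['granola bars']
```

Note that under this logic a recipe whose only ingredient is butter would also pass, because the positive half only asks for any one of the allowed names.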

To test this, I set up the following index:

PUT /test_index
{
   "settings": {
      "number_of_shards": 1
   },
   "mappings": {
      "doc": {
         "properties": {
            "ingredients": {
               "type": "nested",
               "properties": {
                  "name": {
                     "type": "string"
                  },
                  "quantity": {
                     "type": "long"
                  }
               }
            },
            "name": {
               "type": "string"
            }
         }
      }
   }
}

and added these two documents:

PUT /test_index/doc/1
{
   "name": "granola bars",
   "ingredients": [
      {
         "name": "butter",
         "quantity": 4
      },
      {
         "name": "granola",
         "quantity": 6
      }
   ]
}

PUT /test_index/doc/2
{
   "name": "granola cookies",
   "ingredients": [
      {
         "name": "butter",
         "quantity": 5
      },
      {
         "name": "granola",
         "quantity": 7
      },
      {
         "name": "milk",
         "quantity": 2
      },
      {
         "name": "sugar",
         "quantity": 7
      }
   ]
}

Your query returns both documents. For the purposes of this question, and to make it easier to follow, I first simplified your query:

POST /test_index/doc/_search
{
   "query": {
      "filtered": {
         "query": {
            "match_all": {}
         },
         "filter": {
            "nested": {
               "path": "ingredients",
               "filter": {
                  "terms": {
                     "ingredients.name": [
                        "butter",
                        "granola"
                     ]
                  }
               }
            }
         }
      }
   }
}

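Then I added an outer "bool" containing two "nested" filters. The first is the filter you originally had, inside a "must". The second is the inverse of that filter (so it matches nested documents that do not contain those terms), inside a "must_not":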

POST /test_index/doc/_search
{
   "query": {
      "filtered": {
         "query": {
            "match_all": {}
         },
         "filter": {
            "bool": {
               "must": [
                  {
                     "nested": {
                        "path": "ingredients",
                        "filter": {
                           "terms": {
                              "ingredients.name": [
                                 "butter",
                                 "granola"
                              ]
                           }
                        }
                     }
                  }
               ],
               "must_not": [
                  {
                     "nested": {
                        "path": "ingredients",
                        "filter": {
                           "not": {
                              "filter": {
                                 "terms": {
                                    "ingredients.name": [
                                       "butter",
                                       "granola"
                                    ]
                                 }
                              }
                           }
                        }
                     }
                  }
               ]
            }
         }
      }
   }
}
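If the list of allowed values changes often, the request body above can be generated rather than written by hand. The following is a minimal sketch that only builds the JSON; the helper names `nested_filter` and `only_these_values` are hypothetical, and the `require_all` option is an assumption that goes beyond the query above: it emits one nested "term" clause per value, so that every listed value has to be present.

```python
import json


def nested_filter(path, inner):
    """Wrap a filter so it is evaluated against nested documents under `path`."""
    return {"nested": {"path": path, "filter": inner}}


def only_these_values(path, field, values, require_all=False):
    """Build a bool filter: some nested doc matches, none falls outside `values`."""
    any_of = {"terms": {field: list(values)}}
    if require_all:
        # Stricter variant (an assumption, not part of the query above):
        # one nested clause per value, so each value must appear somewhere.
        must = [nested_filter(path, {"term": {field: v}}) for v in values]
    else:
        must = [nested_filter(path, any_of)]
    # Inverted filter inside must_not: reject parents with any other value.
    must_not = [nested_filter(path, {"not": {"filter": any_of}})]
    return {"bool": {"must": must, "must_not": must_not}}


body = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": only_these_values(
                "ingredients", "ingredients.name", ["butter", "granola"]
            ),
        }
    }
}
print(json.dumps(body, indent=3))
```

With `require_all` left at its default, the printed body has the same structure as the request above.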

This returns only one doc:

{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 1,
      "hits": [
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "1",
            "_score": 1,
            "_source": {
               "name": "granola bars",
               "ingredients": [
                  {
                     "name": "butter",
                     "quantity": 4
                  },
                  {
                     "name": "granola",
                     "quantity": 6
                  }
               ]
            }
         }
      ]
   }
}

Here is all the code I used to test it:

http://sense.qbox.io/gist/e5fd0c35070fb329d40ad342b3198695e6f52d3a
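The requests above rely on the "filtered" query and the "not" filter, which were deprecated in the Elasticsearch 2.x line and removed in 5.0. On later versions the same double negative can probably be written with a plain "bool" query, where "nested" takes a "query" key instead of "filter". The sketch below only builds the request body and has not been run against a cluster; the function name `only_these_values_query` is hypothetical.

```python
import json


def only_these_values_query(path, field, values):
    """Bool-query form of the double negative, without `filtered` or `not`."""
    terms = {"terms": {field: list(values)}}
    return {
        "bool": {
            # Filter context: some nested doc carries one of the values.
            "filter": [{"nested": {"path": path, "query": terms}}],
            # Reject parents that have a nested doc failing the terms check.
            "must_not": [
                {
                    "nested": {
                        "path": path,
                        "query": {"bool": {"must_not": [terms]}},
                    }
                }
            ],
        }
    }


body = {
    "query": only_these_values_query(
        "ingredients", "ingredients.name", ["butter", "granola"]
    )
}
print(json.dumps(body, indent=3))
```

On those versions the "string" mapping type is also gone; a "keyword" field would be the usual choice for exact ingredient names, though that is an assumption about the mapping rather than something tested here.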