Elasticsearch query (like SQL "having")

Time: 2015-01-16 10:59:03

Tags: elasticsearch

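This is a table of people and the languages they speak. I need to get the people (hid) who know only one language. For the test languages (eng and ger), I want to get hid 3 and hid 4.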

PUT test/huml/1 
{"hid":1,"lang":"eng"}
PUT test/huml/2 
{"hid":1,"lang":"ger"}
PUT test/huml/3 
{"hid":1,"lang":"fr"}
PUT test/huml/4 
{"hid":2,"lang":"eng"}
PUT test/huml/5 
{"hid":2,"lang":"jap"}
PUT test/huml/6 
{"hid":3,"lang":"eng"}
PUT test/huml/7 
{"hid":4,"lang":"ger"}
PUT test/huml/8 
{"hid":5,"lang":"eng"}
PUT test/huml/9
{"hid":5,"lang":"ger"}
PUT test/huml/10 
{"hid":6,"lang":"eng"}
PUT test/huml/111 
{"hid":6,"lang":"jap"}

In Oracle SQL it would be done like this:

with
  t as (
  select 1 hid, 'eng' l from dual union all
  select 1, 'ger' from dual union all
  select 1, 'fr' from dual union all      
  select 2, 'eng' from dual union all
  select 2, 'jap' from dual union all
  select 3, 'eng' from dual union all
  select 4, 'ger' from dual union all
  select 5, 'eng' from dual union all
  select 5, 'ger' from dual union all
  select 6, 'eng' from dual union all
  select 6, 'jap' from dual 
)
 -- group by hid only, so that HAVING sees all of a person's rows
 select hid, max(l)
 from t
 group by hid
 having count(distinct case when l in ('eng','ger') then l end) = 1
    and count(1) = 1
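
To make the intended result concrete, here is a minimal Python sketch of the same grouping logic, run in memory over the sample rows above. The function name `single_language_speakers` and its parameters are only illustrative; they are not part of any Elasticsearch or Oracle API. It counts distinct languages per person, which matches the SQL as long as the data has no duplicate rows.

```python
from collections import defaultdict

# Sample rows from the question: (hid, lang)
ROWS = [
    (1, "eng"), (1, "ger"), (1, "fr"),
    (2, "eng"), (2, "jap"),
    (3, "eng"),
    (4, "ger"),
    (5, "eng"), (5, "ger"),
    (6, "eng"), (6, "jap"),
]


def single_language_speakers(rows, wanted):
    """Return (hid, lang) for people who speak exactly one distinct
    language, where that language is one of `wanted`."""
    langs_by_hid = defaultdict(set)
    for hid, lang in rows:
        langs_by_hid[hid].add(lang)
    result = []
    for hid in sorted(langs_by_hid):
        langs = langs_by_hid[hid]
        if len(langs) == 1 and langs <= set(wanted):
            result.append((hid, next(iter(langs))))
    return result


print(single_language_speakers(ROWS, ["eng", "ger"]))
# prints [(3, 'eng'), (4, 'ger')]
```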

1 answer:

Answer 0: (score: 1)

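I don't think there is a way to do directly what you are asking (but see the GitHub issues here and here). You could probably hack something together with a scripted metric aggregation, although that is not ideal either (and I assume it would not scale well, though I have not tried it).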

With what you posted, you can easily find how many languages each user speaks:

POST /test_index/_search?search_type=count
{
    "aggs": {
        "humans": {
            "terms": { "field": "hid" },
            "aggs": {
                "num_of_langs": {
                    "value_count": { "field": "lang" }
                }
            }
        }
    }
}

But that does not seem to be what you are asking for.
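
If filtering on the client is acceptable, the bucket counts from that aggregation can be post-filtered to approximate the HAVING clause. The sketch below is a minimal Python example that assumes the usual response shape for a `terms` aggregation with a `value_count` sub-aggregation. The helper name `hids_with_lang_count` is hypothetical and the sample response is written by hand, not real server output.

```python
def hids_with_lang_count(response, count=1):
    """Keep the `hid` keys whose `num_of_langs` value equals `count`."""
    buckets = response["aggregations"]["humans"]["buckets"]
    return [b["key"] for b in buckets if b["num_of_langs"]["value"] == count]


# Hand-written sample in the assumed response shape (not real server output).
sample_response = {
    "aggregations": {
        "humans": {
            "buckets": [
                {"key": 1, "doc_count": 3, "num_of_langs": {"value": 3}},
                {"key": 2, "doc_count": 2, "num_of_langs": {"value": 2}},
                {"key": 3, "doc_count": 1, "num_of_langs": {"value": 1}},
                {"key": 4, "doc_count": 1, "num_of_langs": {"value": 1}},
                {"key": 5, "doc_count": 2, "num_of_langs": {"value": 2}},
                {"key": 6, "doc_count": 2, "num_of_langs": {"value": 2}},
            ]
        }
    }
}

print(hids_with_lang_count(sample_response))  # prints [3, 4]
```

This only tells you who speaks exactly one language, not which one, and a `terms` aggregation returns only the top buckets unless `size` is raised, so it is a partial workaround at best.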

However, if you modify your schema a little, you can solve the problem (more or less) with a combination of bool and has_child filters. Here is one way.

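I took the documents you posted and extracted a "parent" object for each "hid". I used a mapping to set up the parent-child relationship, then bulk indexed the documents: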

DELETE /test_index

PUT /test_index
{
   "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 0
   },
   "mappings": {
      "human": {
         "properties": {
            "hid": { "type": "long" }
         }
      },
      "has_lang": {
         "_parent": { "type": "human" },
         "properties": {
            "hid": { "type": "long" },
            "lang": { "type": "string" }
         }
      }
   }
}

PUT /test_index/_bulk
{"index":{"_index":"test_index", "_type":"human", "_id":1}}
{"hid":1}
{"index":{"_index":"test_index", "_type":"human", "_id":2}}
{"hid":2}
{"index":{"_index":"test_index", "_type":"human", "_id":3}}
{"hid":3}
{"index":{"_index":"test_index", "_type":"human", "_id":4}}
{"hid":4}
{"index":{"_index":"test_index", "_type":"human", "_id":5}}
{"hid":5}
{"index":{"_index":"test_index", "_type":"human", "_id":6}}
{"hid":6}

PUT /test_index/_bulk
{"index":{"_index":"test_index", "_type":"has_lang", "_parent":1, "_id":1}}
{"hid":1,"lang":"eng"}
{"index":{"_index":"test_index", "_type":"has_lang", "_parent":1, "_id":2}}
{"hid":1,"lang":"ger"}
{"index":{"_index":"test_index", "_type":"has_lang", "_parent":1, "_id":3}}
{"hid":1,"lang":"fr"}
{"index":{"_index":"test_index", "_type":"has_lang", "_parent":2, "_id":4}}
{"hid":2,"lang":"eng"}
{"index":{"_index":"test_index", "_type":"has_lang", "_parent":2, "_id":5}}
{"hid":2,"lang":"jap"}
{"index":{"_index":"test_index", "_type":"has_lang", "_parent":3, "_id":6}}
{"hid":3,"lang":"eng"}
{"index":{"_index":"test_index", "_type":"has_lang", "_parent":4, "_id":7}}
{"hid":4,"lang":"ger"}
{"index":{"_index":"test_index", "_type":"has_lang", "_parent":5, "_id":8}}
{"hid":5,"lang":"eng"}
{"index":{"_index":"test_index", "_type":"has_lang", "_parent":5, "_id":9}}
{"hid":5,"lang":"ger"}
{"index":{"_index":"test_index", "_type":"has_lang", "_parent":6, "_id":10}}
{"hid":6,"lang":"eng"}
{"index":{"_index":"test_index", "_type":"has_lang", "_parent":6, "_id":11}}
{"hid":6,"lang":"jap"}
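
For a larger data set you would not write the bulk bodies by hand. Here is a minimal Python sketch of how the two bulk payloads above could be generated from flat (hid, lang) records. `build_bulk_lines` is a hypothetical helper; it only builds the newline-delimited JSON lines and does not send anything to Elasticsearch.

```python
import json


def build_bulk_lines(records, index="test_index"):
    """Build bulk action/source lines: one `human` parent per distinct hid,
    then one `has_lang` child per (hid, lang) record."""
    parent_lines, child_lines = [], []
    seen = set()
    for child_id, (hid, lang) in enumerate(records, start=1):
        if hid not in seen:
            seen.add(hid)
            parent_lines.append(json.dumps(
                {"index": {"_index": index, "_type": "human", "_id": hid}}))
            parent_lines.append(json.dumps({"hid": hid}))
        child_lines.append(json.dumps(
            {"index": {"_index": index, "_type": "has_lang",
                       "_parent": hid, "_id": child_id}}))
        child_lines.append(json.dumps({"hid": hid, "lang": lang}))
    return parent_lines, child_lines


records = [(1, "eng"), (1, "ger"), (1, "fr"), (2, "eng")]
parents, children = build_bulk_lines(records)
# A bulk body is newline-delimited JSON and must end with a newline.
print("\n".join(parents) + "\n")
print("\n".join(children) + "\n")
```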

Then I can query for the people who speak a given language and no other language, like this:

POST /test_index/human/_search
{
   "filter": {
      "bool": {
         "must": [
            {
               "has_child": {
                  "type": "has_lang",
                  "filter": { "term": { "lang": "ger" } }
               }
            }
         ],
         "must_not": [
            {
               "has_child": {
                  "type": "has_lang",
                  "filter": {
                     "not": {
                        "filter": {  "term": { "lang": "ger" } }
                     }
                  }
               }
            }
         ]
      }
   }
}
...
{
   "took": 4,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 1,
      "hits": [
         {
            "_index": "test_index",
            "_type": "human",
            "_id": "4",
            "_score": 1,
            "_source": {
               "hid": 4
            }
         }
      ]
   }
}

You would still have to do this for each language, so this approach may not be ideal, but hopefully it gets you closer.
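
Repeating the query per language is at least easy to automate. The sketch below builds the same filter body for any language. `only_speaks_filter` is a hypothetical helper name, and the body simply mirrors the query above, so it assumes the same filter syntax and the same `has_lang` child type.

```python
import json


def only_speaks_filter(lang, child_type="has_lang"):
    """Search body matching parents that have a child with `lang`
    and no child with any other language."""
    has_lang = {"term": {"lang": lang}}
    return {
        "filter": {
            "bool": {
                "must": [
                    {"has_child": {"type": child_type, "filter": has_lang}}
                ],
                "must_not": [
                    {"has_child": {
                        "type": child_type,
                        "filter": {"not": {"filter": has_lang}},
                    }}
                ],
            }
        }
    }


# One request body per language of interest.
for lang in ("eng", "ger"):
    print(json.dumps(only_speaks_filter(lang)))
```

Each printed body would be posted to /test_index/human/_search in turn, and the hits merged on the client.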

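I also tried to get the answer you want with aggregations, but never found a way to make it work. If and when reducer aggregations are implemented, they might solve this kind of problem, if I understand the idea correctly.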

Here is the code I used:

http://sense.qbox.io/gist/0615ec52346ae6e547988b156b221484dbfde50c