Question

I have 1,000,000 contacts that belong to different groups. For example:

{"gps_id": [{"gid": "G1"},{"gid": "G2"}],"is_active": true,"contact": "c1"}
{"gps_id": [{"gid": "G2"}],"is_active": true,"contact": "c2"}
....
{"gps_id": [{"gid": "G1"},{"gid": "G2"}],"is_active": true,"contact": "c1000000"}

Suppose G1 has 500,000 contacts and G2 has 1,000,000 contacts, 500,000 of which are also in G1.

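I want to filter the documents above by this condition: "given a list of group IDs, get the unique contacts from all of the corresponding groups."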
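
To make the intended result concrete, here is a minimal Python sketch of the same selection done in memory. The helper name `unique_contacts` is hypothetical, the `c3`/`G3` document is an invented extra added only to show a non-match, and restricting to `is_active` contacts is an assumption taken from the query attempt. It illustrates the expected output only, not how Elasticsearch would compute it.

```python
# Sample documents in the same shape as the ones above.
# The third document (c3 / G3) is hypothetical and only shows an excluded contact.
docs = [
    {"gps_id": [{"gid": "G1"}, {"gid": "G2"}], "is_active": True, "contact": "c1"},
    {"gps_id": [{"gid": "G2"}], "is_active": True, "contact": "c2"},
    {"gps_id": [{"gid": "G3"}], "is_active": True, "contact": "c3"},
]


def unique_contacts(documents, group_ids):
    """Return active contacts that belong to at least one of group_ids, each once."""
    wanted = set(group_ids)
    seen = set()
    result = []
    for doc in documents:
        if not doc["is_active"]:
            continue
        doc_groups = {group["gid"] for group in doc["gps_id"]}
        # A contact in both G1 and G2 must still be reported only once.
        if doc_groups & wanted and doc["contact"] not in seen:
            seen.add(doc["contact"])
            result.append(doc["contact"])
    return result


print(unique_contacts(docs, ["G1", "G2"]))  # prints ['c1', 'c2']
```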

I tried the Elasticsearch script query shown below, but it does not work:

{
  "query": {
    "bool": {
      "must": {
        "script": {
          "script": {
            "inline": "for (int i = 0; i < params.gps_id.length; ++i) {ctx._source.gps_id.add(params.gps_id[i]) }",
            "lang": "painless",
            "params": {
              "gps_id": [
                {
                  "gid": "G1"
                },
                {
                  "gid": "G2"
                }
              ]
            }
          }
        }
      },
      "must": [

        {
          "match": {
            "is_active": true
          }
        },
        {
          "nested": {
            "path": "gps_id",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "gps_id.gid": "G1"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}

The number of groups, and the number of contacts in each, may grow.

The Elasticsearch version is 5.1.2.

Please suggest the best way to implement this.

What is the best way to filter on a field that lives inside a nested object in Elasticsearch?
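
For reference, one possible shape for such a filter is sketched below in Python, which only builds the request body. This is an untested sketch resting on several assumptions: each contact is stored as exactly one document (so matching documents are already unique contacts), `gps_id` is mapped as `nested`, and `gps_id.gid` is a `keyword` field so that `terms` matches "G1" exactly. The function name `build_group_filter` is hypothetical.

```python
import json


def build_group_filter(group_ids):
    """Build a search body matching active contacts in any of the given groups."""
    return {
        "query": {
            "bool": {
                # filter clauses do not contribute to scoring
                "filter": [
                    {"term": {"is_active": True}},
                    {
                        "nested": {
                            "path": "gps_id",
                            # terms matches a document if any nested gid is in the list
                            "query": {"terms": {"gps_id.gid": list(group_ids)}},
                        }
                    },
                ]
            }
        }
    }


body = build_group_filter(["G1", "G2"])
print(json.dumps(body, indent=2))
```

Because a contact that belongs to both G1 and G2 is still a single document under these assumptions, it would appear once in the hits without any scripting.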

0 answers: