Indexing and querying lists of UUIDs

Time: 2015-02-02 20:55:13

Tags: performance indexing elasticsearch

Some fields in our data contain lists of UUIDs as values, e.g.:

{
 "name": "pupkin",
 "group": "admins",
 "assets": ["d1f84400-91b6-425c-a11b-9ba7e59930ce",
            "99478356-f6b3-49e2-8cae-f408d5a24492"],
 "action": "login",
 "children": ["2637833e-1017-4d82-bc65-951fffc09c7d",
              "c30f7c34-7a50-4031-bf74-94d413acec15",
              "cffef4ef-df9e-4079-ac2f-50bbe332e223"],
 "level": 20
}

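Most of our queries on this data check fields against long lists of UUIDs (dozens, sometimes hundreds, and possibly thousands once we scale). The lists change from time to time, so we cannot precompute x ∈ L at write time for every value x in every event and every list L.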

Our current, rather naive way of getting, e.g., the data for a histogram via _search?search_type=count is:

 {"query":
  {"bool":
   {"must": [
    {"query_string": {"query": "user:pupkin AND (assets:d1f84400-91b6-425c-a11b-9ba7e59930ce OR assets:99478356-f6b3-49e2-8cae-f408d5a24492 OR assets:2637833e-1017-4d82-bc65-951fffc09c7d OR assets:c30f7c34-7a50-4031-bf74-94d413acec15)"}},
    {"range": {"time": {"gt": "2014-11-01T00:00:00Z", "lte": "2014-11-01T00:20:00.0001Z"}}}
   ]}},
  "aggs": {"counts": {"date_histogram": {"field": "time", "interval": "minute", "min_doc_count": 0}}}}

But this is inefficient: a list of 60 UUIDs slows the query down by a factor of 10. How can we reduce this factor?

1 answer:

Answer 0 (score: 1):

Instead of query_string, I would try using only filters, to take advantage of filter caching, which should speed up subsequent requests:

{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "assets": [
                  "d1f84400-91b6-425c-a11b-9ba7e59930ce",
                  "99478356-f6b3-49e2-8cae-f408d5a24492",
                  "2637833e-1017-4d82-bc65-951fffc09c7d",
                  "c30f7c34-7a50-4031-bf74-94d413acec15"
                ]
              }
            },
            {
              "range": {
                "time": {
                  "gt": "2014-11-01T00:00:00Z",
                  "lte": "2014-11-01T00:20:00.0001Z"
                }
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "counts": {
      "date_histogram": {
        "field": "time",
        "interval": "minute",
        "min_doc_count": 0
      }
    }
  }
}
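Since the UUID lists change over time, the request body above would normally be generated rather than written by hand. The following is a minimal sketch of that, not part of the original answer: the function name `build_histogram_request` and its parameters are hypothetical, and it only builds the JSON body without sending it anywhere. It also adds a `term` filter on `user`, which the question's query had but the answer's example leaves out; the field name `user` is taken from the question's query string and is an assumption about the actual mapping. It further assumes `assets` is indexed as exact values (for example `not_analyzed` in a 1.x mapping), because `term` and `terms` filters do not analyze their input, so UUIDs tokenized at the hyphens would likely not match.

```python
import json


def build_histogram_request(user, asset_ids, time_gt, time_lte):
    """Build a filtered-query body: exact user, any of asset_ids, time range.

    Hypothetical helper; field names ("user", "assets", "time") follow the
    question's examples and are assumptions about the real mapping.
    """
    filters = [
        # Exact match on the user field (assumed name, see lead-in).
        {"term": {"user": user}},
        # Matches documents whose "assets" contains ANY of the given UUIDs.
        {"terms": {"assets": list(asset_ids)}},
        # Same time window shape as in the question.
        {"range": {"time": {"gt": time_gt, "lte": time_lte}}},
    ]
    return {
        "query": {"filtered": {"filter": {"bool": {"must": filters}}}},
        "aggs": {
            "counts": {
                "date_histogram": {
                    "field": "time",
                    "interval": "minute",
                    "min_doc_count": 0,
                }
            }
        },
    }


if __name__ == "__main__":
    body = build_histogram_request(
        "pupkin",
        [
            "d1f84400-91b6-425c-a11b-9ba7e59930ce",
            "99478356-f6b3-49e2-8cae-f408d5a24492",
        ],
        "2014-11-01T00:00:00Z",
        "2014-11-01T00:20:00.0001Z",
    )
    # Print the body that would be sent to _search?search_type=count.
    print(json.dumps(body, indent=2))
```

The `filtered` query is the syntax of the Elasticsearch 1.x line this thread dates from; later versions express the same thing with a `bool` query and a `filter` clause, so the wrapper would need adjusting there.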