Elasticsearch array value count aggregation

Time: 2016-09-20 09:43:11

Tags: elasticsearch elasticsearch-2.0 elasticsearch-aggregation

Sample documents

{
    "id": "62655",
    "attributes": [
        {
            "name": "genre",
            "value": "comedy"
        },
        {
            "name": "year",
            "value": "2016"
        }
    ]
}

{
    "id": "62656",
    "attributes": [
        {
            "name": "genre",
            "value": "horror"
        },
        {
            "name": "year",
            "value": "2016"
        }
    ]
}

{
    "id": "62657",
    "attributes": [
        {
            "name": "language",
            "value": "english"
        },
        {
            "name": "year",
            "value": "2015"
        }
    ]
}

Expected output

{
    "hits" : {
        "total": 3,
        "hits": []
    },
    "aggregations": {
        "attribCount": {
            "language": 1,
            "genre": 2,
            "year": 3
        },
        "attribVals": {
            "language": {
                "english": 1
            },
            "genre": {
                "comedy": 1,
                "horror": 1
            },
            "year": {
                "2016": 2,
                "2015": 1
            }
        }
    }
}

My query

I can get the "attribCount" aggregation with the query below, but I don't know how to count the values of each attribute.

{
    "query": {
        "filtered": {
            "query": {
                "match_all": {}
            }
        }
    },
    "aggs": {
        "attribCount": {
            "terms": {
                "field": "attributes.name",
                "size": 0
            }
        }
    },
    "size": 0
}

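When I aggregate on attributes.value, I get overall counts across all attributes. What I need is the values listed under their attribute name, as in the expected output.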
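To make the target grouping concrete, here is a minimal sketch that reproduces the expected counts in plain Python from the three sample documents. It does not touch Elasticsearch at all, and the helper name `count_attributes` is hypothetical; it only illustrates the "name, then value, then count" shape the aggregation should return.

```python
from collections import Counter, defaultdict

# The three sample documents from the question.
docs = [
    {"id": "62655", "attributes": [{"name": "genre", "value": "comedy"},
                                   {"name": "year", "value": "2016"}]},
    {"id": "62656", "attributes": [{"name": "genre", "value": "horror"},
                                   {"name": "year", "value": "2016"}]},
    {"id": "62657", "attributes": [{"name": "language", "value": "english"},
                                   {"name": "year", "value": "2015"}]},
]


def count_attributes(documents):
    """Count attribute names, and values grouped under each name."""
    attrib_count = Counter()
    attrib_vals = defaultdict(Counter)
    for doc in documents:
        for attr in doc["attributes"]:
            attrib_count[attr["name"]] += 1
            attrib_vals[attr["name"]][attr["value"]] += 1
    # Convert to plain dicts so the result compares and prints cleanly.
    return dict(attrib_count), {n: dict(v) for n, v in attrib_vals.items()}


attrib_count, attrib_vals = count_attributes(docs)
print(attrib_count)
print(attrib_vals)
```

Each value is counted only together with the name from the same attribute object, which is exactly the pairing a plain terms aggregation on attributes.value loses.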

1 answer:

Answer 0 (score: 2)

As you said, the attributes field is nested. Try this; it should work.
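For reference, a nested aggregation of the kind this answer alludes to might look like the sketch below. It is not the answerer's own query. It assumes the mapping declares attributes as type nested, with attributes.name and attributes.value indexed as not_analyzed strings, and it keeps "size": 0 on the terms aggregations as the question's own query does (valid on Elasticsearch 2.x). The aggregation name "attribs", the helpers `build_request` and `flatten`, and the hand-written `sample_response` are all hypothetical; the response is typed by hand to show the bucket shape and was not returned by a real cluster.

```python
import json


def build_request():
    """Request body: nested -> terms on name -> sub-terms on value."""
    return {
        "size": 0,
        "query": {"match_all": {}},
        "aggs": {
            "attribs": {
                "nested": {"path": "attributes"},
                "aggs": {
                    "attribCount": {
                        "terms": {"field": "attributes.name", "size": 0},
                        "aggs": {
                            "attribVals": {
                                "terms": {"field": "attributes.value", "size": 0}
                            }
                        },
                    }
                },
            }
        },
    }


def flatten(response):
    """Turn the bucket lists into the two maps from the expected output."""
    buckets = response["aggregations"]["attribs"]["attribCount"]["buckets"]
    attrib_count = {b["key"]: b["doc_count"] for b in buckets}
    attrib_vals = {
        b["key"]: {v["key"]: v["doc_count"] for v in b["attribVals"]["buckets"]}
        for b in buckets
    }
    return attrib_count, attrib_vals


# Hand-written stand-in for a response (assumption, not real cluster output).
sample_response = {"aggregations": {"attribs": {"doc_count": 6, "attribCount": {"buckets": [
    {"key": "year", "doc_count": 3, "attribVals": {"buckets": [
        {"key": "2016", "doc_count": 2}, {"key": "2015", "doc_count": 1}]}},
    {"key": "genre", "doc_count": 2, "attribVals": {"buckets": [
        {"key": "comedy", "doc_count": 1}, {"key": "horror", "doc_count": 1}]}},
    {"key": "language", "doc_count": 1, "attribVals": {"buckets": [
        {"key": "english", "doc_count": 1}]}},
]}}}}

print(json.dumps(build_request(), indent=2))
print(flatten(sample_response))
```

Elasticsearch returns buckets as lists rather than as the key-to-count maps shown in the expected output, so some client-side reshaping like `flatten` is likely needed either way. Note also that doc_count inside a nested aggregation counts nested attribute objects, not parent documents; the two coincide here only because no sample document repeats an attribute name.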

import os

import pandas as pd
import numpy as np
import tkinter as tk
from tkinter import messagebox


def create_file():
    # Read the folder, file names and sheet names from the Entry widgets.
    # e0..e7 and e1_sheet..e6_sheet are expected to be defined elsewhere.
    try:
        FILEPATH = e0.get()
        w_filename = e1.get()
        x_filename = e2.get()
        y_filename = e3.get()
        z_inventory_filename = e4.get()
        aa_active_filename = e5.get()
        ab_test_filename = e6.get()
        output_filename = e7.get()

        w_sheetname = e1_sheet.get()
        x_sheetname = e2_sheet.get()
        y_sheetname = e3_sheet.get()
        z_sheetname = e4_sheet.get()
        aa_sheetname = e5_sheet.get()
        ab_test_sheetname = e6_sheet.get()
    except Exception:
        messagebox.showinfo("Error", "Please fill out all fields.")
        return  # nothing to load without the inputs

    # Load each workbook. Current pandas spells the keyword sheet_name;
    # the older sheetname spelling is no longer accepted.
    try:
        w = pd.read_excel(os.path.join(FILEPATH, w_filename), sheet_name=w_sheetname, header=0)
        x = pd.read_excel(os.path.join(FILEPATH, x_filename), sheet_name=x_sheetname, header=0)
        y = pd.read_excel(os.path.join(FILEPATH, y_filename), sheet_name=y_sheetname, header=0)
        # The variables read above are z_sheetname and aa_sheetname; the
        # original passed undefined names here, which always raised.
        z_inventory = pd.read_excel(os.path.join(FILEPATH, z_inventory_filename), sheet_name=z_sheetname, header=0)
        aa_active = pd.read_excel(os.path.join(FILEPATH, aa_active_filename), sheet_name=aa_sheetname, header=0)
        ab_test_ready = pd.read_excel(os.path.join(FILEPATH, ab_test_filename), sheet_name=ab_test_sheetname, header=0)
    except Exception:
        messagebox.showinfo("Error", "You have mistyped either a filename or a sheetname.")
        return