The structure of my index:
[
  {
    "Id": "1",
    "Path": "/Series/Current/SerieA/foo/foo",
    "PlayCount": 100
  },
  {
    "Id": "2",
    "Path": "/Series/Current/SerieA/bar/foo",
    "PlayCount": 1000
  },
  {
    "Id": "3",
    "Path": "/Series/Current/SerieA/bar/bar",
    "PlayCount": 50
  },
  {
    "Id": "4",
    "Path": "/Series/Current/SerieB/bla/bla",
    "PlayCount": 300
  },
  {
    "Id": "5",
    "Path": "/Series/Current/SerieB/goo/boo",
    "PlayCount": 200
  },
  {
    "Id": "6",
    "Path": "/Series/Current/SerieC/foo/zoo",
    "PlayCount": 100
  }
]
I want to run an aggregation that returns the sum of "PlayCount" for each series, like this:
[
  {
    "key": "serieA",
    "TotalPlayCount": 1150
  },
  {
    "key": "serieB",
    "TotalPlayCount": 500
  },
  {
    "key": "serieC",
    "TotalPlayCount": 100
  }
]
This is how I tried to do it, but the query obviously fails because this is not the right approach:
{
  "size": 0,
  "query": {
    "filtered": {
      "query": {
        "regexp": {
          "Path": "/Series/Current/.*"
        }
      }
    }
  },
  "aggs": {
    "play_count_for_current_series": {
      "terms": {
        "field": "Path",
        "regexp": "/Series/Current/([^/]+)"
      },
      "aggs": {
        "Total_play": { "sum": { "field": "PlayCount" } }
      }
    }
  }
}
Is there a way to do this?
Answer 0 (score: 0)
My suggestion is the following:
DELETE test
PUT /test
{
  "settings": {
    "analysis": {
      "filter": {
        "my_special_filter": {
          "type": "pattern_capture",
          "preserve_original": 0,
          "patterns": [
            "/Series/Current/([^/]+)"
          ]
        }
      },
      "analyzer": {
        "my_special_analyzer": {
          "tokenizer": "whitespace",
          "filter": [
            "my_special_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "Path": {
          "type": "string",
          "fields": {
            "for_aggregations": {
              "type": "string",
              "analyzer": "my_special_analyzer"
            },
            "raw": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}
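Create a special analyzer that uses a pattern_capture filter to capture only the terms you are interested in. Because I did not want to change the current mapping of that field, I added a fields section with a sub-field (for_aggregations) that uses this special analyzer. I also added a raw sub-field that is not_analyzed, which helps with the query itself.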
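To get a feel for which term ends up in Path.for_aggregations, here is a rough Python sketch of the same idea. It is only a simulation and does not call Elasticsearch: the helper name capture_tokens is made up for illustration, and it assumes that a token which does not match the pattern is kept unchanged.

```python
import re

# Same pattern as in "my_special_filter" from the index settings.
SERIES_PATTERN = re.compile(r"/Series/Current/([^/]+)")


def capture_tokens(path):
    """Hypothetical helper: whitespace tokenizer followed by a capture-group filter."""
    tokens = []
    for token in path.split():  # split on whitespace, like the "whitespace" tokenizer
        match = SERIES_PATTERN.search(token)
        if match:
            # Keep only the captured group, i.e. the series name.
            tokens.append(match.group(1))
        else:
            # Assumption: a token that does not match is emitted unchanged.
            tokens.append(token)
    return tokens


print(capture_tokens("/Series/Current/SerieA/foo/foo"))
# ['SerieA']
print(capture_tokens("/Sersdasdies/Curradent/SerieC/foo/zoo"))
# ['/Sersdasdies/Curradent/SerieC/foo/zoo']
```

Under that assumption, a path outside /Series/Current/ would still produce a term of its own, which is one reason to keep a filter on the query side.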
POST test/test/_bulk
{"index":{}}
{"Id":"1","Path":"/Series/Current/SerieA/foo/foo","PlayCount":100}
{"index":{}}
{"Id":"2","Path":"/Series/Current/SerieA/bar/foo","PlayCount":1000}
{"index":{}}
{"Id":"3","Path":"/Series/Current/SerieA/bar/bar","PlayCount":50}
{"index":{}}
{"Id":"4","Path":"/Series/Current/SerieB/bla/bla","PlayCount":300}
{"index":{}}
{"Id":"5","Path":"/Series/Current/SerieB/goo/boo","PlayCount":200}
{"index":{}}
{"Id":"6","Path":"/Series/Current/SerieC/foo/zoo","PlayCount":100}
{"index":{}}
{"Id":"7","Path":"/Sersdasdies/Curradent/SerieC/foo/zoo","PlayCount":100}
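As for the query, the aggregation no longer needs a regular expression, because it runs on the sub-field that contains only the SerieX terms you need. The regexp on Path.raw only restricts which documents are searched.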
GET /test/test/_search
{
  "size": 0,
  "query": {
    "filtered": {
      "query": {
        "regexp": {
          "Path.raw": "/Series/Current/.*"
        }
      }
    }
  },
  "aggs": {
    "play_count_for_current_series": {
      "terms": {
        "field": "Path.for_aggregations"
      },
      "aggs": {
        "Total_play": {
          "sum": {
            "field": "PlayCount"
          }
        }
      }
    }
  }
}
The result is:
"play_count_for_current_series": {
  "doc_count_error_upper_bound": 0,
  "sum_other_doc_count": 0,
  "buckets": [
    {
      "key": "SerieA",
      "doc_count": 3,
      "Total_play": {
        "value": 1150
      }
    },
    {
      "key": "SerieB",
      "doc_count": 2,
      "Total_play": {
        "value": 500
      }
    },
    {
      "key": "SerieC",
      "doc_count": 1,
      "Total_play": {
        "value": 100
      }
    }
  ]
}
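As a sanity check on these totals, the same grouping can be reproduced outside Elasticsearch with plain Python over the seven sample documents. This is only a sketch of the logic (keep paths under /Series/Current/, take the series name, sum PlayCount), not what Elasticsearch runs internally; the name total_play_by_series is made up for illustration.

```python
import re
from collections import defaultdict

# The seven documents from the bulk request (Id omitted for brevity).
DOCS = [
    {"Path": "/Series/Current/SerieA/foo/foo", "PlayCount": 100},
    {"Path": "/Series/Current/SerieA/bar/foo", "PlayCount": 1000},
    {"Path": "/Series/Current/SerieA/bar/bar", "PlayCount": 50},
    {"Path": "/Series/Current/SerieB/bla/bla", "PlayCount": 300},
    {"Path": "/Series/Current/SerieB/goo/boo", "PlayCount": 200},
    {"Path": "/Series/Current/SerieC/foo/zoo", "PlayCount": 100},
    {"Path": "/Sersdasdies/Curradent/SerieC/foo/zoo", "PlayCount": 100},
]

CURRENT_SERIES = re.compile(r"/Series/Current/([^/]+)")


def total_play_by_series(docs):
    """Hypothetical helper: sum PlayCount per series for paths under /Series/Current/."""
    totals = defaultdict(int)
    for doc in docs:
        # re.match anchors at the start of the path, so the seventh
        # document (wrong prefix) is skipped, as with the regexp query.
        match = CURRENT_SERIES.match(doc["Path"])
        if match is None:
            continue
        totals[match.group(1)] += doc["PlayCount"]
    return dict(totals)


print(total_play_by_series(DOCS))
# {'SerieA': 1150, 'SerieB': 500, 'SerieC': 100}
```

The totals agree with the Total_play values in the buckets, and the document with the misspelled prefix contributes nothing to SerieC.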