Elasticsearch nested aggregation

Date: 2015-02-17 08:51:02

Tags: java elasticsearch count aggregation

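A few words about the schema: I have a single document type (review) that contains a list of reviews (nested objects). Each review has the following fields: polarity (negative or positive), keyword (the main word of the review), and reviewer. My goal is to find the top positive and top negative keywords and, for each of them, the opposite count: if a keyword is among the top positives I need its negative count, and if it is among the top negatives I need its positive count.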

For example

(based on the data provided below)

  • Top negative
    • iphone - 2
      • opposite count (positive) - 2
    • samsung - 1
      • opposite count (positive) - 0
  • Top positive
    • iphone - 2
      • opposite count (negative) - 2
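
To make the expected numbers concrete, here is a minimal Python sketch that derives them from the sample reviews in the data section. It only illustrates the counting rule and is not an Elasticsearch query; `top_with_opposite` and `OPPOSITE` are hypothetical names introduced for this example.

```python
from collections import Counter

# Sample reviews from the question's data, flattened to (polarity, keyword) pairs.
reviews = [
    ("negative", "iphone"),
    ("negative", "samsung"),
    ("positive", "iphone"),
    ("negative", "iphone"),
    ("positive", "iphone"),
]

OPPOSITE = {"negative": "positive", "positive": "negative"}


def top_with_opposite(reviews, polarity):
    """Return [(keyword, count, opposite_count)] sorted by count, descending."""
    counts = Counter(reviews)  # keyed by (polarity, keyword)
    own = [(kw, n) for (pol, kw), n in counts.items() if pol == polarity]
    own.sort(key=lambda item: (-item[1], item[0]))
    # A Counter returns 0 for a missing key, which covers keywords
    # that have no reviews with the opposite polarity.
    return [(kw, n, counts[(OPPOSITE[polarity], kw)]) for kw, n in own]


print(top_with_opposite(reviews, "negative"))
print(top_with_opposite(reviews, "positive"))
```

For the sample data this yields iphone (2, opposite 2) and samsung (1, opposite 0) for the negative list, and iphone (2, opposite 2) for the positive list, matching the example above.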

Thanks in advance for your time.

Schema:

curl -XPOST "http://localhost:9200/forum_poc" -d '
{
  "settings": {
    "number_of_shards": 9,
    "number_of_replicas": 1
  },
  "mappings": {
    "_default_": {
      "_all": {
        "enabled": false
      },
      "_source": {
        "enabled": true
      },
      "dynamic": "false"
    },
    "ReviewEvent": {
      "_source": {
        "enabled": true
      },
      "properties": {
        "Reviews": {
          "type": "nested",
          "include_in_parent": true,
          "properties": {
            "polarity": {
              "type": "string",
              "index": "not_analyzed",
              "store": "true"
            },
            "reviewer": {
              "type": "string",
              "index": "not_analyzed",
              "store": "true"
            },
            "keyword": {
              "type": "string",
              "index": "not_analyzed",
              "store": "true"
            }
          }
        }
      }
    }
  }
}'

Data:

curl -XPOST "http://localhost:9200/forum_poc/_bulk" -d '{"index":{"_index":"forum_poc","_type":"ReviewEvent","_id":0}}
{"Reviews":[{"polarity":"negative","reviewer":"jhon","keyword":"iphone"},{"polarity":"negative","reviewer":"kevin","keyword":"samsung"}]}
{"index":{"_index":"forum_poc","_type":"ReviewEvent","_id":1}}
{"Reviews":[{"polarity":"positive","reviewer":"Doron","keyword":"iphone"}]}
{"index":{"_index":"forum_poc","_type":"ReviewEvent","_id":2}}
{"Reviews":[{"polarity":"negative","reviewer":"Michel","keyword":"iphone"}]}
{"index":{"_index":"forum_poc","_type":"ReviewEvent","_id":4}}
{"Reviews":[{"polarity":"positive","reviewer":"Afi","keyword":"iphone"}]}
'

My query:

POST forum_poc/_search?search_type=count
{
  "aggs": {
    "aggregation": {
      "nested": {
        "path": "Reviews"
      },
      "aggs": {
        "polarity": {
          "terms": {
            "field": "polarity",
            "size": 10
          },
          "aggs": {
            "keyword": {
              "terms": {
                "field": "keyword",
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}

I also need the opposite count for each keyword. This is what the query currently returns:

{
   "took": 7,
   "timed_out": false,
   "_shards": {
      "total": 9,
      "successful": 9,
      "failed": 0
   },
   "hits": {
      "total": 4,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "aggregation": {
         "doc_count": 5,
         "polarity": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
               {
                  "key": "negative",
                  "doc_count": 3,
                  "keyword": {
                     "doc_count_error_upper_bound": 0,
                     "sum_other_doc_count": 0,
                     "buckets": [
                        {
                           "key": "iphone",
                           "doc_count": 2
                        },
                        {
                           "key": "samsung",
                           "doc_count": 1
                        }
                     ]
                  }
               },
               {
                  "key": "positive",
                  "doc_count": 2,
                  "keyword": {
                     "doc_count_error_upper_bound": 0,
                     "sum_other_doc_count": 0,
                     "buckets": [
                        {
                           "key": "iphone",
                           "doc_count": 2
                        }
                     ]
                  }
               }
            ]
         }
      }
   }
}
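
Since this response already holds one keyword list per polarity, one option is to look each keyword up in the other polarity's bucket on the client side. The following is a minimal Python sketch of that idea, assuming the response has been parsed into a dict (the literal below is a trimmed copy of the response above); `opposite_counts` is a hypothetical helper name.

```python
# Trimmed copy of the response above: polarity -> keyword -> doc_count.
response = {
    "aggregations": {
        "aggregation": {
            "polarity": {
                "buckets": [
                    {"key": "negative", "doc_count": 3, "keyword": {"buckets": [
                        {"key": "iphone", "doc_count": 2},
                        {"key": "samsung", "doc_count": 1},
                    ]}},
                    {"key": "positive", "doc_count": 2, "keyword": {"buckets": [
                        {"key": "iphone", "doc_count": 2},
                    ]}},
                ]
            }
        }
    }
}


def opposite_counts(response):
    """Attach to every keyword the count it has under the other polarity."""
    buckets = response["aggregations"]["aggregation"]["polarity"]["buckets"]
    # Build {polarity: {keyword: doc_count}} for quick lookups.
    by_polarity = {
        b["key"]: {k["key"]: k["doc_count"] for k in b["keyword"]["buckets"]}
        for b in buckets
    }
    result = {}
    for polarity, keywords in by_polarity.items():
        other = "positive" if polarity == "negative" else "negative"
        result[polarity] = [
            # A keyword missing from the other polarity's buckets counts as 0.
            {"keyword": kw, "count": n,
             "opposite": by_polarity.get(other, {}).get(kw, 0)}
            for kw, n in keywords.items()
        ]
    return result


print(opposite_counts(response))
```

One caveat of this approach: because each terms aggregation is limited by `size`, a keyword that falls outside the top 10 of the other polarity would be reported with an opposite count of 0 even if it has some reviews there.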

1 answer:

Answer 0 (score: 1)

Why not swap the aggregation levels? Aggregate on keyword first, and then on polarity:

POST forum_poc/_search?search_type=count
{
  "aggs": {
    "aggregation": {
      "nested": {
        "path": "Reviews"
      },
      "aggs": {
        "polarity": {
          "terms": {
            "field": "keyword",
            "size": 10
          },
          "aggs": {
            "keyword": {
              "terms": {
                "field": "polarity",
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}
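
With the levels swapped, every keyword bucket carries both polarity counts, so the two "top" lists can be assembled on the client. Below is a minimal Python sketch of that post-processing. The response literal is hand-written to match the shape this query should return for the sample data (an assumption, not output captured from a real cluster), and `top_keywords` is a hypothetical helper name.

```python
# Hand-written response in the shape the swapped query should return for the
# sample data (an assumption, not captured from a real cluster). The aggregation
# names follow the query as posted: the outer one is called "polarity" but
# groups by keyword, the inner one is called "keyword" but groups by polarity.
response = {
    "aggregations": {"aggregation": {"polarity": {"buckets": [
        {"key": "iphone", "doc_count": 4, "keyword": {"buckets": [
            {"key": "negative", "doc_count": 2},
            {"key": "positive", "doc_count": 2},
        ]}},
        {"key": "samsung", "doc_count": 1, "keyword": {"buckets": [
            {"key": "negative", "doc_count": 1},
        ]}},
    ]}}}
}


def top_keywords(response, polarity, size=10):
    """Return [(keyword, count, opposite_count)] for one polarity."""
    other = "positive" if polarity == "negative" else "negative"
    rows = []
    for kw_bucket in response["aggregations"]["aggregation"]["polarity"]["buckets"]:
        counts = {b["key"]: b["doc_count"] for b in kw_bucket["keyword"]["buckets"]}
        # Skip keywords that have no reviews with the requested polarity.
        if counts.get(polarity, 0) > 0:
            rows.append((kw_bucket["key"], counts[polarity], counts.get(other, 0)))
    # Highest count first; ties are broken alphabetically by keyword.
    rows.sort(key=lambda r: (-r[1], r[0]))
    return rows[:size]


print(top_keywords(response, "negative"))
print(top_keywords(response, "positive"))
```

Note that the outer terms aggregation ranks keywords by their total count across both polarities, so when there are more keywords than `size`, the per-polarity top list built this way may be approximate; raising `size` is one way to reduce that risk.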