How to add a post-aggregation value field as a metric in Druid io

Time: 2016-06-06 11:12:07

Tags: java druid

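I am using Druid io 0.9.0. I am trying to add a post-aggregation field to the metrics spec. My intent is to have the value of the post-aggregation field displayed the same way a metric (measure) is displayed when using Pivot with Druid.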

My Druid io schema file is:

    {
      "dataSources" : {
        "NPS1112" : {
          "spec" : {
            "dataSchema" : {
              "dataSource" : "NPS1112",
              "parser" : {
                "type" : "string",
                "parseSpec" : {
                  "timestampSpec" : {
                    "column" : "timestamp",
                    "format" : "auto"
                  },
                  "dimensionsSpec" : {
                    "dimensions" : ["dimension1","dimension2","dimension3"],
                     "dimensionExclusions" : [
                      "timestamp",
                      "OverallRating",
                      "DeliveryTimeRating",
                      "ItemQualityRating",
                      "isPromoter",
                      "isDetractor"
                    ]
                  },
                  "format" : "json"
                }
              },
              "granularitySpec" : {
                "type" : "uniform",
                "segmentGranularity" : "hour",
                "queryGranularity" : "none"
              },
             "aggregations" : [
             { "type" : "count", "name" : "rows"},
             { "type" : "doubleSum", "name" : "CountOfPromoters", "fieldName" : "isPromoter" },
             { "type" : "doubleSum", "name" : "CountOfDetractor", "fieldName" : "isDetractor" }
            ],
            "postAggregations" : [
            { "type"   : "arithmetic",
              "name"   : "PromoterPercentage",
              "fn"     : "/",
              "fields" : [
                   { "type" : "fieldAccess", "name" : "CountOfPromoters", "fieldName" : "CountOfPromoters" },
                   { "type" : "fieldAccess", "name" : "rows", "fieldName" : "rows" }
                  ]
             },
             { "type"   : "arithmetic",
              "name"   : "DetractorPercentage",
              "fn"     : "/",
              "fields" : [
                   { "type" : "fieldAccess", "name" : "CountOfDetractor", "fieldName" : "CountOfDetractor" },
                   { "type" : "fieldAccess", "name" : "rows", "fieldName" : "rows" }
                  ]
             },
             { "type"   : "arithmetic",
              "name"   : "NPS",
              "fn"     : "-",
              "fields" : [
                   { "type" : "fieldAccess", "name" : "PromoterPercentage", "fieldName" : "PromoterPercentage" },
                   { "type" : "fieldAccess", "name" : "DetractorPercentage", "fieldName" : "DetractorPercentage" }
                  ]
             }
             ],
              "metricsSpec" : [
                {
                  "type" : "count",
                  "name" : "CountOfResponses"
                },
                {
                  "type" : "fieldAccess",
                  "name" : "CountOfPromoters"
                }
              ]
            },
            "ioConfig" : {
              "type" : "realtime"
            },
            "tuningConfig" : {
              "type" : "realtime",
              "maxRowsInMemory" : "10000",
              "intermediatePersistPeriod" : "PT10M",
              "windowPeriod" : "PT10M"
            }
          },
          "properties" : {
            "task.partitions" : "1",
            "task.replicants" : "1"
          }
        }
      },
      "properties" : {
        "zookeeper.connect" : "localhost",
        "druid.discovery.curator.path" : "/druid/discovery",
        "druid.selectors.indexing.serviceName" : "druid/overlord",
        "http.port" : "8200",
        "http.threads" : "4"
      }
    }

My code for sending the fields with the Java client:

          final Map<String,Object> obj = new HashMap<String, Object>();

          obj.put("timestamp", new DateTime().toString());

          obj.put("OverallRating", (ran.nextInt(high-low) + low));
          obj.put("DeliveryTimeRating", (ran.nextInt(high-low) + low));
          obj.put("ItemQualityRating", (ran.nextInt(high-low) + low));
          obj.put("isPromoter", ((ran.nextInt(high-low) + low)%2) == 0 ? 1 : 0);
          obj.put("isDetractor", ((ran.nextInt(high-low) + low)%2) == 0 ? 1 : 0);

          obj.put("dimension1", "dimension1-"+ (ran.nextInt(high-low) + low));
          obj.put("dimension2", "dimension2-"+ (ran.nextInt(high-low) + low));
          obj.put("dimension3", "dimension3-"+ (ran.nextInt(high-low) + low));

Can anyone point out my mistake?

1 Answer:

Answer 0 (score: 0):

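I don't know whether you can do this in your ingestion spec (I would really like to know if we can!), but you can add your post-aggregations in the Pivot configuration. As I understand it, post-aggregations are actually part of the Druid query.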
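
To illustrate what "part of the query" means, here is a minimal Java sketch that builds the JSON body of a Druid timeseries query in which `aggregations` and `postAggregations` appear at query time. It assumes the ingestion `metricsSpec` stores `CountOfResponses` (a count), `CountOfPromoters` and `CountOfDetractor` (sums of `isPromoter` and `isDetractor`) as ordinary metrics; the interval and the class and method names are hypothetical. The percentages are nested inside the `NPS` post-aggregator rather than referenced by name.

```java
// A sketch only: builds a query body as a plain string, with no Druid client involved.
class NpsQuerySketch {

    // Post-aggregator operand that reads the output of a query-time aggregator.
    static String fieldAccess(String name) {
        return "{\"type\":\"fieldAccess\",\"name\":\"" + name
                + "\",\"fieldName\":\"" + name + "\"}";
    }

    // Arithmetic post-aggregator; its operands may themselves be post-aggregators.
    static String arithmetic(String name, String fn, String left, String right) {
        return "{\"type\":\"arithmetic\",\"name\":\"" + name + "\",\"fn\":\"" + fn
                + "\",\"fields\":[" + left + "," + right + "]}";
    }

    static String buildQuery() {
        String promoterPct = arithmetic("PromoterPercentage", "/",
                fieldAccess("CountOfPromoters"), fieldAccess("rows"));
        String detractorPct = arithmetic("DetractorPercentage", "/",
                fieldAccess("CountOfDetractor"), fieldAccess("rows"));
        String nps = arithmetic("NPS", "-", promoterPct, detractorPct);

        return "{"
                + "\"queryType\":\"timeseries\","
                + "\"dataSource\":\"NPS1112\","
                + "\"granularity\":\"all\","
                // Hypothetical interval, chosen only for the example.
                + "\"intervals\":[\"2016-06-01/2016-07-01\"],"
                + "\"aggregations\":["
                // Assumed metric names; a stored count is summed at query time.
                + "{\"type\":\"longSum\",\"name\":\"rows\",\"fieldName\":\"CountOfResponses\"},"
                + "{\"type\":\"doubleSum\",\"name\":\"CountOfPromoters\",\"fieldName\":\"CountOfPromoters\"},"
                + "{\"type\":\"doubleSum\",\"name\":\"CountOfDetractor\",\"fieldName\":\"CountOfDetractor\"}"
                + "],"
                + "\"postAggregations\":[" + nps + "]"
                + "}";
    }

    public static void main(String[] args) {
        // Print the body; it could then be POSTed to the broker's query endpoint.
        System.out.println(buildQuery());
    }
}
```

Note that `fieldAccess` is a post-aggregator type, so it would not be expected to work as an entry in `metricsSpec`, which takes aggregators such as `count` or `doubleSum`.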

First, generate a config file with pivot:

    pivot --druid your.druid.broker.host:8082 --print-config --with-comments > config.yaml

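Then modify config.yaml. The syntax is completely different, but you can combine aggregators quite easily. Here is the example provided in the config.yaml file: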

  # This is the place where you might want to add derived measures (a.k.a Post Aggregators).
  #
  # Here are some examples of possible derived measures:
  #
  # - name: ecpm
  #   title: eCPM
  #   expression: $main.sum($revenue) / $main.sum($impressions) * 1000
  #
  # - name: usa_revenue
  #   title: USA Revenue
  #   expression: $main.filter($country == 'United States').sum($revenue)
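
For the NPS case in the question, a derived measure would presumably follow the same pattern, with an expression along the lines of `($main.sum($CountOfPromoters) - $main.sum($CountOfDetractor)) / $main.sum($CountOfResponses)`. That expression is an untested assumption modelled on the examples above, and it assumes those three metrics are ingested. The Java sketch below only illustrates the arithmetic such a measure stands for, and why it is evaluated over summed counts at query time instead of being stored per row. The sample rows and all names are made up.

```java
// A sketch of the NPS arithmetic over hypothetical rolled-up rows.
class NpsArithmeticSketch {

    // Each row is {promoters, detractors, responses}.
    // Ratio of sums: what a query-time derived measure computes.
    static double ratioOfSums(double[][] rows) {
        double promoters = 0, detractors = 0, responses = 0;
        for (double[] row : rows) {
            promoters += row[0];
            detractors += row[1];
            responses += row[2];
        }
        // Returning 0 for an empty result is a choice made for this sketch.
        return responses == 0 ? 0.0 : (promoters - detractors) / responses;
    }

    // Mean of per-row ratios: what you would get if the ratio were
    // precomputed for each stored row and then averaged.
    static double meanOfRatios(double[][] rows) {
        if (rows.length == 0) {
            return 0.0;
        }
        double sum = 0;
        for (double[] row : rows) {
            sum += row[2] == 0 ? 0.0 : (row[0] - row[1]) / row[2];
        }
        return sum / rows.length;
    }

    public static void main(String[] args) {
        // Two made-up rows with very different response counts.
        double[][] rows = { {3, 1, 4}, {1, 3, 16} };
        System.out.println("ratio of sums:  " + ratioOfSums(rows));
        System.out.println("mean of ratios: " + meanOfRatios(rows));
    }
}
```

The two results differ because the rows carry different numbers of responses, which is why the ratio is better left to the query layer (Pivot or a Druid post-aggregator) than written into the ingested data.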

Finally, run pivot with the --config flag:

    pivot --config config.yaml

Hope it helps! :)