Kafka Streams: Record & Aggregate

Date: 2018-05-04 05:32:32

Tags: apache-kafka apache-kafka-streams

[
    {
        "device_nm": "x1",
        "type": "external",
        "mtrc1": 100,
        "mtrc2": 25,
        "starttime": "2018-05-04 01:00:00",
        "model": "t20"
    },
    {
        "device_nm": "x1",
        "type": "external",
        "mtrc1": 5,
        "mtrc2": 11,
        "starttime": "2018-05-04 02:00:00",
        "model": "t20"
    },
    {
        "device_nm": "x1",
        "type": "internal",
        "mtrc1": 35,
        "mtrc2": 15,
        "starttime": "2018-05-04 01:00:00",
        "model": "t40"
    },
    {
        "device_nm": "x1",
        "type": "internal",
        "mtrc1": 53,
        "mtrc2": 22,
        "starttime": "2018-05-04 02:00:00",
        "model": "t40"
    }
]

Suppose each Kafka message contains an array of JSON objects like the one above. Using KStream / KTable, I want to group by device_nm and type, truncate starttime to a date, and aggregate mtrc1 and mtrc2. The output should look like this:

[
    {
        "device_nm": "x1",
        "type": "external",
        "mtrc1": 105,
        "mtrc2": 36,
        "date": "2018-05-04",
        "model": "t20"
    },
    {
        "device_nm": "x1",
        "type": "internal",
        "mtrc1": 88,
        "mtrc2": 37,
        "date": "2018-05-04",
        "model": "t40"
    }
]

How can we use the aggregation API while retaining all the attributes?

1 answer:

Answer 0 (score: 1)

Kafka Streams groups and aggregates based on the record key, so you have to build a composite key from the fields you want to group by:

public class MKey {
    String device;
    String type;
}

public class MBody {
    int mtrc1;
    int mtrc2;
}

StreamsBuilder sb = new StreamsBuilder();
KStream<MKey, MBody> stream =
        sb.stream("topic",
                  Consumed.with(new JsonSerde<>(MKey.class), new JsonSerde<>(MBody.class, objectMapper)));
stream.groupByKey()
      .aggregate(MBody::new,
                 (key, value, aggr) -> {
                     // sum the metrics for each composite key
                     aggr.mtrc1 += value.mtrc1;
                     aggr.mtrc2 += value.mtrc2;
                     return aggr;
                 });