How to aggregate on multiple JSON fields in Kafka Streams

Time: 2018-07-14 23:08:10

Tags: apache-kafka apache-kafka-streams

In the code below, I want to aggregate records on both the "job" and "country" fields. Currently I can only aggregate on the "job" attribute.

final Serializer<JsonNode> jsonSerializer = new JsonSerializer();
final Deserializer<JsonNode> jsonDeserializer = new JsonDeserializer();
final Serde<JsonNode> jsonSerde = Serdes.serdeFrom(jsonSerializer, jsonDeserializer);

KStream<String, JsonNode> personDetail = builder.stream("person-streams-input", Consumed.with(Serdes.String(), jsonSerde));
KTable<String,Long> articleagg = personDetail
                .groupBy((key,value) -> value.get("job").asText(), Serialized.with(Serdes.String(), jsonSerde))
                .count();

Sample JSON:

{
  "name": "abc",
  "zipcode": "111111",
  "job": "engineer",
  "country": "USA"
}

1 answer:

Answer 0 (score: 1)

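You can build any groupBy condition from the message key and value, for example a composite key that concatenates both fields: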

KTable<String,Long> articleagg = personDetail
                .groupBy((key,value) -> getGroupByCondition(value), Serialized.with(Serdes.String(), jsonSerde))
                .count();


// Builds a composite grouping key such as "engineer_USA" from the record value.
private static String getGroupByCondition(JsonNode value) {
    return value.get("job").asText() + "_" + value.get("country").asText();
}
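
To see what the composite key does to the counts without running a Kafka cluster, the grouping logic can be sketched in plain Java. This is only an illustration under stated assumptions: each record is modelled as a Map<String, String> instead of a JsonNode, the class name CompositeKeyCount and its methods are hypothetical, and the "unknown" fallback for a missing field is not part of the answer above (there, value.get("job") returns null for a missing field and the call to asText() would throw).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-alone sketch: mimics groupBy(...).count() on in-memory records.
class CompositeKeyCount {

    // Builds the composite grouping key "job_country".
    // Assumption: a missing field falls back to "unknown".
    static String groupKey(Map<String, String> record) {
        String job = record.getOrDefault("job", "unknown");
        String country = record.getOrDefault("country", "unknown");
        return job + "_" + country;
    }

    // Counts records per composite key, keeping first-seen key order.
    static Map<String, Long> countByJobAndCountry(List<Map<String, String>> records) {
        return records.stream()
                .collect(Collectors.groupingBy(CompositeKeyCount::groupKey,
                        LinkedHashMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Map<String, String>> records = List.of(
                Map.of("name", "abc", "job", "engineer", "country", "USA"),
                Map.of("name", "def", "job", "engineer", "country", "USA"),
                Map.of("name", "ghi", "job", "engineer", "country", "India"),
                Map.of("name", "jkl", "job", "doctor", "country", "USA"));
        countByJobAndCountry(records)
                .forEach((key, count) -> System.out.println(key + " = " + count));
    }
}
```

With these four sample records, the two engineers in the USA share the key engineer_USA and are counted together, while engineer_India and doctor_USA each get their own count. One design point to keep in mind: if a field value can itself contain "_", two different (job, country) pairs could produce the same key, so a separator that cannot occur in the data may be a safer choice.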