Kafka Streams: grouping a KTable by a value field

Time: 2019-05-16 18:24:20

Tags: apache-kafka apache-kafka-streams

I have a use case where my KTable looks like this.

KTable: orderTable

Key : Value

{123} : {id1,12}

{124} : {id2,10}

{125} : {id1,5}

{126} : {id2,11}

KTable orderByIdTable => this table should be the result of a groupBy on the value field (id), with the count column values summed per id: id1 = (12+5), id2 = (10+11).

Key : Value

{id1} : {17}

{id2} : {21}

1 Answer:

Answer 0 (score: 2)

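Here is a code example (using only Java primitive types, which made it quicker for me to put together) that demonstrates how to re-key (i.e., re-partition) a KTable, which results in a new KTable. You should be able to easily adapt it to your case of turning a KTable<String, Order> into a KTable<String, Long>.

I would personally pick Variant 2 for your use case.

Example below. It is not fully tested, and tombstone records (messages with a non-null key but a null value, which indicate that the key should be deleted from the table) may not be handled correctly.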

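import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

// inputTopic, outputTopic1 and outputTopic2 are topic-name Strings defined elsewhere.
final StreamsBuilder builder = new StreamsBuilder();
final KTable<Integer, String> table = builder.table(inputTopic, Consumed.with(Serdes.Integer(), Serdes.String()));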

// Variant 1 (https://docs.confluent.io/current/streams/faq.html#option-1-write-kstream-to-ak-read-back-as-ktable)
// Here, we re-key the KTable, write the results to a new topic, and then re-read that topic into a new KTable.
table
    .toStream()
    .map((key, value) -> KeyValue.pair(value, key))
    .to(outputTopic1, Produced.with(Serdes.String(), Serdes.Integer()));
KTable<String, Integer> rekeyedTable1 =
    builder.table(outputTopic1, Consumed.with(Serdes.String(), Serdes.Integer()));

// Variant 2 (https://docs.confluent.io/current/streams/faq.html#option-2-perform-a-dummy-aggregation)
// Here, we re-key the KTable (resulting in a KGroupedTable), and then perform a dummy aggregation to turn the
// KGroupedTable into a KTable.
final KTable<String, Integer> rekeyedTable2 =
    table
        .groupBy(
            (key, value) -> KeyValue.pair(value, key),
            Grouped.with(Serdes.String(), Serdes.Integer())
        )
        // Dummy aggregation
        .reduce(
            (aggValue, newValue) -> newValue, /* adder */
            (aggValue, oldValue) -> oldValue  /* subtractor */
        );
rekeyedTable2.toStream().to(outputTopic2, Produced.with(Serdes.String(), Serdes.Integer()));
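
To adapt Variant 2 to the summing use case in the question, the dummy adder and subtractor would be replaced by real ones: the adder adds the new record's count to the running sum for its id, and the subtractor removes the old record's count when that record is updated or deleted. In Kafka Streams this corresponds to KTable#groupBy followed by KGroupedTable#reduce(adder, subtractor) or KGroupedTable#aggregate(initializer, adder, subtractor). The following is only a plain-Java model of that bookkeeping, with no Kafka dependency, so the arithmetic can be checked in isolation. The names OrderSumSketch, Order, apply and sums are hypothetical and not part of any Kafka API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java model (not the Kafka Streams API) of grouping a table by a value field and summing. */
class OrderSumSketch {

    /** Hypothetical value type: "id" is the field to group by, "count" is the value to sum. */
    static final class Order {
        final String id;
        final long count;

        Order(String id, long count) {
            this.id = id;
            this.count = count;
        }
    }

    // Latest order per order key; stands in for orderTable.
    private final Map<Integer, Order> orderTable = new HashMap<>();
    // Running sum per id; stands in for orderByIdTable.
    private final Map<String, Long> sumById = new TreeMap<>();

    /** Applies an upsert (newValue != null) or a tombstone (newValue == null) for one order key. */
    void apply(int key, Order newValue) {
        Order oldValue = (newValue == null)
            ? orderTable.remove(key)
            : orderTable.put(key, newValue);
        if (oldValue != null) {
            // Subtractor step: remove the old contribution from the group it belonged to.
            sumById.merge(oldValue.id, -oldValue.count, Long::sum);
        }
        if (newValue != null) {
            // Adder step: add the new contribution to its (possibly different) group.
            sumById.merge(newValue.id, newValue.count, Long::sum);
        }
    }

    Map<String, Long> sums() {
        return sumById;
    }

    public static void main(String[] args) {
        OrderSumSketch sketch = new OrderSumSketch();
        sketch.apply(123, new Order("id1", 12));
        sketch.apply(124, new Order("id2", 10));
        sketch.apply(125, new Order("id1", 5));
        sketch.apply(126, new Order("id2", 11));
        System.out.println(sketch.sums()); // prints {id1=17, id2=21}

        sketch.apply(126, null);           // tombstone: order 126 is deleted
        System.out.println(sketch.sums()); // prints {id1=17, id2=10}
    }
}
```

The point of the model is that a sum over a table needs both steps: without the subtractor, an update or delete of an order would leave its old count in the total. In this sketch a group whose orders are all deleted stays in the map with a sum of 0.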