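Using the same topic as a source multiple times with the Kafka Streams DSL

Time: 2018-09-20 13:50:56

Tags: apache-kafka apache-kafka-streams

When using the Kafka Streams DSL, is it possible to use the same topic as the source for two different processing routines?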

StreamsBuilder streamsBuilder = new StreamsBuilder();

// use the topic as a stream
streamsBuilder.stream("topic")...

// use the same topic as a source for KTable
streamsBuilder.table("topic")...

return streamsBuilder.build();

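The naive implementation throws a TopologyException at runtime: Invalid topology: Topic has already been registered by another source. If we drop down to the underlying Processor API, this is perfectly valid. Is using it the only way out?

Update: The closest alternative I have found so far: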

StreamsBuilder streamsBuilder = new StreamsBuilder();

final KStream<Object, Object> stream = streamsBuilder.stream("topic");

// use the topic as a stream
stream...

// create a KTable from the KStream
stream.groupByKey().reduce((oldValue, newValue) -> newValue)...

return streamsBuilder.build();
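For reference, the workaround above could be fleshed out roughly as follows. This is only a sketch under assumptions: the String key and value types, the output topic names "stream-output" and "table-output", and the class name SharedSourceTopology are made up for illustration and are not part of the original question.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class SharedSourceTopology {

    // Builds a topology that registers "topic" as a source exactly once
    // and then uses the resulting KStream in two different ways.
    static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();

        // Assumption: keys and values are Strings.
        KStream<String, String> stream =
            builder.stream("topic", Consumed.with(Serdes.String(), Serdes.String()));

        // Use 1: treat the records as a plain stream of events.
        // "stream-output" is a hypothetical topic name.
        stream.filter((key, value) -> value != null)
              .to("stream-output", Produced.with(Serdes.String(), Serdes.String()));

        // Use 2: derive a KTable that keeps the latest value per key.
        KTable<String, String> latest = stream
            .groupByKey()
            .reduce((oldValue, newValue) -> newValue);

        // "table-output" is a hypothetical topic name.
        latest.toStream()
              .to("table-output", Produced.with(Serdes.String(), Serdes.String()));

        return builder.build();
    }

    public static void main(String[] args) {
        // Print the topology structure; no broker is needed for this.
        System.out.println(build().describe());
    }
}
```

One caveat: this is not fully equivalent to streamsBuilder.table("topic"). KGroupedStream#reduce skips records whose value is null, so a tombstone does not delete the key from the resulting table. Depending on the Kafka Streams version, the groupByKey() step may also need serdes supplied explicitly or through the default serde configuration.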

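2 answers:

Answer 0: (score: 1)

Reading the same topic as both a stream and a table is semantically questionable, IMHO. A stream models immutable facts, whereas a changelog topic that you would read into a KTable models updates.

If you want to use a single topic in multiple streams, you can reuse the same KStream object multiple times (semantically, this is similar to a broadcast):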

KStream stream = ...
stream.filter();
stream.map();

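Also compare: https://issues.apache.org/jira/browse/KAFKA-6687 (there are plans to remove this restriction. I doubt, though, that we will allow using one topic as both a KStream and a KTable at the same time; compare my comment above).

Answer 1: (score: 0)

Yes, you can, but for that you need multiple StreamsBuilder instances: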

StreamsBuilder streamsBuilder1 = new StreamsBuilder();
streamsBuilder1.stream("topic");

StreamsBuilder streamsBuilder2 = new StreamsBuilder();
streamsBuilder2.table("topic");

Topology topology1 = streamsBuilder1.build();
Topology topology2 = streamsBuilder2.build();

KafkaStreams kafkaStreams1 = new KafkaStreams(topology1, streamsConfig1);
KafkaStreams kafkaStreams2 = new KafkaStreams(topology2, streamsConfig2);

Also make sure that the application.id value is different for each KafkaStreams instance.
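The two configuration objects passed to the KafkaStreams constructors above are not shown. A minimal sketch of how they might be built, assuming a broker at localhost:9092 and the made-up application ids "topic-stream-app" and "topic-table-app" (the helper configFor and the class name are hypothetical as well):

```java
import java.util.Properties;

public class DistinctAppIds {

    // Builds a minimal Kafka Streams configuration for one application.
    // "application.id" and "bootstrap.servers" are the standard config keys;
    // the broker address is an assumption for illustration.
    static Properties configFor(String applicationId) {
        Properties props = new Properties();
        props.put("application.id", applicationId);
        props.put("bootstrap.servers", "localhost:9092");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical ids; any two different values would do.
        Properties streamsConfig1 = configFor("topic-stream-app");
        Properties streamsConfig2 = configFor("topic-table-app");

        System.out.println(streamsConfig1.getProperty("application.id"));
        System.out.println(streamsConfig2.getProperty("application.id"));
    }
}
```

Because application.id also serves as the consumer group id, giving the two instances different ids means each one reads every record of the topic independently instead of sharing its partitions.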