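I am running a simple Kafka Streams program in Eclipse. It runs successfully, but I cannot get windowing to work.
I want to process all the messages received within a 5-second window and send them to the output topic. From searching, I understand that I need to use tumbling windows. However, I see that the output is sent to the output topic immediately.
What am I doing wrong here? Below is the main method I am running: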
public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-wordcount");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0);
    props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

    final StreamsBuilder builder = new StreamsBuilder();
    KStream<String, String> source = builder.stream("wc-input");

    @SuppressWarnings("deprecation")
    KTable<Windowed<String>, Long> counts = source
        .flatMapValues(new ValueMapper<String, Iterable<String>>() {
            @Override
            public Iterable<String> apply(String value) {
                return Arrays.asList(value.toLowerCase(Locale.getDefault()).split(" "));
            }
        })
        .groupBy(new KeyValueMapper<String, String, String>() {
            @Override
            public String apply(String key, String value) {
                return value;
            }
        })
        .count(TimeWindows.of(10000L)
            .until(10000L), "Counts");

    // need to override value serde to Long type
    counts.to("wc-output");

    final Topology topology = builder.build();
    final KafkaStreams streams = new KafkaStreams(topology, props);
    final CountDownLatch latch = new CountDownLatch(1);

    // attach shutdown handler to catch control-c
    Runtime.getRuntime().addShutdownHook(new Thread("streams-wordcount-shutdown-hook") {
        @Override
        public void run() {
            streams.close();
            latch.countDown();
        }
    });

    try {
        streams.start();
        long windowSizeMs = TimeUnit.MINUTES.toMillis(50000); // 5 * 60 * 1000L
        TimeWindows.of(windowSizeMs);
        TimeWindows.of(windowSizeMs).advanceBy(windowSizeMs);
        latch.await();
    } catch (Throwable e) {
        System.exit(1);
    }
    System.exit(0);
}
Answer 0 (score: 0)
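Windowing does not mean "one output" per window. A windowed count is a KTable that is updated continuously, and with CACHE_MAX_BYTES_BUFFERING_CONFIG set to 0, as in the question, every update is forwarded downstream right away. If you want only one output per window, apply suppress() to the resulting KTable.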
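As a rough illustration of the suppress() approach, here is a minimal sketch of the word count with a 5-second tumbling window. It assumes Kafka Streams 3.0 or newer (for TimeWindows.ofSizeWithNoGrace) and reuses the topic names and settings from the question; the class name WindowedWordCount is made up for this example, and zero grace period plus an unbounded suppression buffer are choices made here, not something the question specifies.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.Locale;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.Suppressed;
import org.apache.kafka.streams.kstream.TimeWindows;

// Hypothetical class name; assumes Kafka Streams 3.0+ on the classpath.
public class WindowedWordCount {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-wordcount");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> source = builder.stream("wc-input");

        source
            // Split each line into lower-case words.
            .flatMapValues(value -> Arrays.asList(value.toLowerCase(Locale.getDefault()).split(" ")))
            // Re-key by word so that equal words are counted together.
            .groupBy((key, word) -> word, Grouped.with(Serdes.String(), Serdes.String()))
            // 5-second tumbling windows with no grace period (an assumption of this sketch).
            .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofSeconds(5)))
            .count()
            // Hold back intermediate updates; emit one final count per window and key.
            .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
            .toStream()
            // Drop the window from the key so plain String/Long serdes can be used.
            .map((windowedWord, count) -> KeyValue.pair(windowedWord.key(), count))
            .to("wc-output", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        streams.start();
    }
}
```

Note that a window closes only when stream time moves past its end, and stream time is driven by record timestamps rather than the wall clock. If no newer records arrive, the result for the last window stays buffered.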
Compare this article: https://www.confluent.io/blog/watermarks-tables-event-time-dataflow-model/
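To make the difference between "an update per record" and "one result per window" concrete, here is a small plain-Java simulation. It does not use Kafka Streams at all: the class TumblingWindowDemo and its methods are invented for illustration, it counts records for a single key, and it assumes non-negative timestamps in milliseconds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative simulation only; not part of any Kafka API.
public class TumblingWindowDemo {

    // Start of the tumbling window that contains the given timestamp.
    public static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    // Table-style output: every record produces an updated count for its window.
    public static List<String> eagerUpdates(long[] timestamps, long sizeMs) {
        Map<Long, Long> counts = new TreeMap<>();
        List<String> out = new ArrayList<>();
        for (long ts : timestamps) {
            long start = windowStart(ts, sizeMs);
            long count = counts.merge(start, 1L, Long::sum);
            out.add(start + "=" + count);
        }
        return out;
    }

    // Suppressed-style output: a window's count is emitted once, and only after
    // a later record has moved stream time to or past the end of that window.
    public static List<String> finalResults(long[] timestamps, long sizeMs) {
        TreeMap<Long, Long> open = new TreeMap<>();
        List<String> out = new ArrayList<>();
        long streamTime = Long.MIN_VALUE;
        for (long ts : timestamps) {
            streamTime = Math.max(streamTime, ts);
            long start = windowStart(ts, sizeMs);
            if (start + sizeMs <= streamTime) {
                continue; // late record for a window that is already closed: dropped
            }
            open.merge(start, 1L, Long::sum);
            // Emit every open window whose end is not after the current stream time.
            while (!open.isEmpty() && open.firstKey() + sizeMs <= streamTime) {
                Map.Entry<Long, Long> closed = open.pollFirstEntry();
                out.add(closed.getKey() + "=" + closed.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long[] timestamps = {1000, 2000, 4000, 6000, 7000, 11000};
        System.out.println(eagerUpdates(timestamps, 5000L));
        // prints [0=1, 0=2, 0=3, 5000=1, 5000=2, 10000=1]
        System.out.println(finalResults(timestamps, 5000L));
        // prints [0=3, 5000=2]
    }
}
```

The first list mirrors what the question observes: each incoming record immediately yields a new count. The second list has one entry per closed window, and the window starting at 10000 is still missing because no later record has arrived to close it.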