I'm trying to do a windowed word count with Kafka Streams. It works, but part of the output is unreadable.
Code:
StringSerializer stringSerializer = new StringSerializer();
StringDeserializer stringDeserializer = new StringDeserializer();
WindowedSerializer<String> windowedSerializer = new WindowedSerializer<>(stringSerializer);
WindowedDeserializer<String> windowedDeserializer = new WindowedDeserializer<>(stringDeserializer);
Serde<Windowed<String>> windowedSerde = Serdes.serdeFrom(windowedSerializer, windowedDeserializer);
TimeWindows window = TimeWindows.of(TimeUnit.MINUTES.toMillis(1)).advanceBy(TimeUnit.MINUTES.toMillis(1));
KStream<String, String> textLines = builder.stream("streams-plaintext-input");
KTable<Windowed<String>, Long> wordCounts = textLines
.flatMapValues(textLine -> Arrays.asList(textLine.toLowerCase().split("\\W+")))
.groupBy((key, word) -> word)
.windowedBy(window)
.count(Materialized.<String, Long, WindowStore<Bytes, byte[]>>as("counts-store"));
wordCounts.toStream().to("streams-plaintext-output", Produced.with(windowedSerde, Serdes.Long()));
KafkaStreams streams = new KafkaStreams(builder.build(), config);
streams.start();
Output:
kafka c[?? 1
yaya c[?? 1
kafka c[?? 2
My guess is that the unreadable part is the window duration. What can I do to make it readable?
Edit
I tried printing the output using windowedSerde:
KStream<Windowed<String>, Long> output = builder.stream("streams-plaintext-output");
output.print(windowedSerde, Serdes.Long());
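It still doesn't work.
Answer 0 (score: 0)
When reading from a topic, you need to use the deserializer that matches the serializer used to write to that topic. In this case, that is the windowedDeserializer, which you are already constructing as follows: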
WindowedDeserializer<String> windowedDeserializer = new WindowedDeserializer<>(stringDeserializer);
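
To illustrate why the key looks garbled when it is read as a plain string, here is a minimal, standard-library-only sketch. It assumes the windowed serializer writes the inner key bytes followed by the window start as an 8-byte big-endian long; that layout is my understanding of WindowedSerializer, not something confirmed in the question. The class name WindowedKeyLayout, its helper methods, and the sample timestamp are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.time.Instant;

public class WindowedKeyLayout {
    // Assumption: the window start timestamp occupies the last 8 bytes of the serialized key.
    static final int TIMESTAMP_SIZE = 8;

    // Builds a byte array in the assumed layout: UTF-8 key bytes, then the window start (big-endian long).
    static byte[] encode(String key, long windowStartMs) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(keyBytes.length + TIMESTAMP_SIZE)
                .put(keyBytes)
                .putLong(windowStartMs)
                .array();
    }

    // Recovers the original string key by ignoring the trailing timestamp bytes.
    static String decodeKey(byte[] raw) {
        return new String(raw, 0, raw.length - TIMESTAMP_SIZE, StandardCharsets.UTF_8);
    }

    // Reads the trailing 8 bytes back as the window start in epoch milliseconds.
    static long decodeWindowStart(byte[] raw) {
        return ByteBuffer.wrap(raw).getLong(raw.length - TIMESTAMP_SIZE);
    }

    public static void main(String[] args) {
        long windowStartMs = 1526000040000L; // hypothetical window start, for illustration only
        byte[] raw = encode("kafka", windowStartMs);

        // Treating every byte as text mixes the key with the raw timestamp bytes.
        System.out.println("as plain string: " + new String(raw, StandardCharsets.UTF_8));

        // Splitting the bytes according to the assumed layout gives readable values.
        System.out.println("key: " + decodeKey(raw)
                + ", window start: " + Instant.ofEpochMilli(decodeWindowStart(raw)));
    }
}
```

If that assumption holds, the unreadable characters after each word are the window start timestamp rather than the window duration. In the topology itself, the equivalent step would presumably be to pass the serdes when reading the topic, for example builder.stream("streams-plaintext-output", Consumed.with(windowedSerde, Serdes.Long())), because builder.stream(topic) without explicit serdes falls back to the default serdes from the config.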