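I want to use Kafka Streams to join logs by ID within a time window.
So far, I can successfully count the logs that share the same ID (see the commented-out code).
However, when I replace the .count method with .aggregate, I get the following error: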
"Failed to flush state store time-windowed-aggregation-stream-store"
Caused by: java.lang.ClassCastException: org.apache.kafka.streams.kstream.Windowed cannot be cast to java.lang.String
I am new to this and cannot figure out the cause of the error. I thought that having .withValueSerde(Serdes.String()) would prevent it.
My code is below:
package myapps;
import java.time.Duration;
import java.util.Properties;
import java.util.concurrent.CountDownLatch;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.*;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Suppressed.*;
import org.apache.kafka.streams.state.WindowStore;
public class MyCode {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-mycode");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        final StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> source = builder.stream("streams-plaintext-input");
        KStream<String, String> changedKeyStream = source.selectKey((k, v)
                -> v.substring(v.indexOf("mid="), v.indexOf("mid=") + 8));

        /* // Working code for count
        changedKeyStream
                .groupByKey()
                .windowedBy(TimeWindows.of(Duration.ofSeconds(3))
                        .grace(Duration.ofSeconds(2)))
                .count(Materialized.with(Serdes.String(), Serdes.Long())) // could be replaced with an aggregator (reducer?) ?
                .suppress(Suppressed.untilWindowCloses(BufferConfig.unbounded()))
                .toStream()
                .print(Printed.toSysOut());
        */

        changedKeyStream
                .groupByKey()
                .windowedBy(TimeWindows.of(Duration.ofSeconds(3)))
                .aggregate(
                        String::new,
                        (String k, String v, String Result) -> { return Result + "\n" + v; },
                        Materialized.<String, String, WindowStore<Bytes, byte[]>>as("time-windowed-aggregated-stream-store") /* state store name */
                                .withValueSerde(Serdes.String())) /* serde for aggregate value */
                .suppress(Suppressed.untilWindowCloses(BufferConfig.unbounded()))
                .toStream()
                .print(Printed.toSysOut());

        changedKeyStream.to("streams-mycode-output", Produced.with(Serdes.String(), Serdes.String()));

        final Topology topology = builder.build();
        final KafkaStreams streams = new KafkaStreams(topology, props);
        final CountDownLatch latch = new CountDownLatch(1);

        // attach shutdown handler to catch control-c
        Runtime.getRuntime().addShutdownHook(new Thread("streams-shutdown-hook") {
            @Override
            public void run() {
                streams.close();
                latch.countDown();
            }
        });

        // launch until control+c
        try {
            streams.start();
            latch.await();
        } catch (Throwable e) {
            System.out.print("Something went wrong!");
            System.exit(1);
        }
        System.exit(0);
    }
}
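To make the intended result concrete, here is a minimal plain-Java sketch, with no Kafka dependency, of what the windowed aggregate is meant to compute: extract the ID, assign the record to a 3-second tumbling window, and append the value to that window's string. The class name `WindowedJoinSketch` and the `key@windowStart` map key are hypothetical, chosen only for illustration. The sketch ignores grace periods, suppression, and out-of-order records, so it models the arithmetic, not the Kafka Streams runtime.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WindowedJoinSketch {
    // Mirrors TimeWindows.of(Duration.ofSeconds(3)) from the code above.
    static final long WINDOW_SIZE_MS = 3000L;

    // Same extraction as the selectKey lambda: "mid=" plus the next four characters.
    static String extractKey(String value) {
        int start = value.indexOf("mid=");
        return value.substring(start, start + 8);
    }

    // Tumbling windows are aligned to the epoch, so the window start is the
    // timestamp rounded down to a multiple of the window size.
    static long windowStart(long timestampMs) {
        return timestampMs - (timestampMs % WINDOW_SIZE_MS);
    }

    // Folds each record into the aggregate for its (key, window) pair, using the
    // same initializer (empty string) and aggregator (aggregate + "\n" + value).
    // The "key@windowStart" format is a hypothetical stand-in for a windowed key.
    static Map<String, String> aggregate(long[] timestamps, String[] values) {
        Map<String, String> store = new LinkedHashMap<>();
        for (int i = 0; i < values.length; i++) {
            String windowedKey = extractKey(values[i]) + "@" + windowStart(timestamps[i]);
            String current = store.getOrDefault(windowedKey, "");
            store.put(windowedKey, current + "\n" + values[i]);
        }
        return store;
    }

    public static void main(String[] args) {
        long[] timestamps = {1000L, 2500L, 3200L};
        String[] values = {"mid=0001 start", "mid=0001 end", "mid=0001 late"};
        aggregate(timestamps, values)
                .forEach((k, v) -> System.out.println(k + " -> " + v.replace("\n", " | ")));
    }
}
```

Two details follow from the code as written. Every aggregate begins with a leading newline, because the initializer is an empty string. Also, `indexOf` returns -1 when "mid=" is absent, in which case `substring` throws a StringIndexOutOfBoundsException.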
Thanks in advance for your help.
Answer 0 (score: 0)
There are two ways to fix it:
1. Pass an org.apache.kafka.streams.kstream.Grouped to KStream::groupByKey.
2. Set an org.apache.kafka.common.serialization.Serde for the key on the Materialized, using Materialized::withKeySerde(...).
Sample code below:
Option 1:
changedKeyStream
        .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
        .windowedBy(TimeWindows.of(Duration.ofSeconds(3)))
        // the rest of the chain (aggregate, suppress, toStream) stays as in the question
Option 2:
changedKeyStream
        .groupByKey()
        .windowedBy(TimeWindows.of(Duration.ofSeconds(3)))
        .aggregate(
                String::new,
                (String k, String v, String Result) -> { return Result + "_" + v; },
                Materialized.<String, String, WindowStore<Bytes, byte[]>>as("time-windowed-aggregated-stream-store") /* state store name */
                        .withValueSerde(Serdes.String())
                        .withKeySerde(Serdes.String())
        )
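As for why the error mentions a cast from Windowed to String, one plausible reading (an assumption, not a confirmed trace through the Kafka Streams source) is that, with no key serde supplied, a downstream step falls back to the default String serde from the config while the key at that point is already a windowed key. The plain-Java sketch below models only that mechanism. `Serializer`, `WindowedKey`, and `StringSerializer` here are hypothetical stand-ins, not the Kafka classes.

```java
import java.nio.charset.StandardCharsets;

public class SerdeCastSketch {
    // Hypothetical stand-in for a generic serializer interface.
    interface Serializer<T> {
        byte[] serialize(T data);
    }

    // Hypothetical stand-in for a windowed key: the original key plus a window start.
    static final class WindowedKey {
        final String key;
        final long windowStartMs;

        WindowedKey(String key, long windowStartMs) {
            this.key = key;
            this.windowStartMs = windowStartMs;
        }
    }

    // Serializer that only knows how to handle plain String keys.
    static final class StringSerializer implements Serializer<String> {
        @Override
        public byte[] serialize(String data) {
            return data.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Models a component that picks up a configured default serializer at runtime.
    // Generics are erased, so the compiler cannot stop a non-String key from
    // reaching a String serializer; the failure only shows up as a runtime cast.
    @SuppressWarnings("unchecked")
    static byte[] serializeUnchecked(Serializer<?> configured, Object key) {
        return ((Serializer<Object>) configured).serialize(key);
    }

    // Returns true if serializing the key with the String serializer throws ClassCastException.
    static boolean failsWithClassCast(Object key) {
        try {
            serializeUnchecked(new StringSerializer(), key);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("plain key fails: " + failsWithClassCast("mid=1234"));
        System.out.println("windowed key fails: " + failsWithClassCast(new WindowedKey("mid=1234", 0L)));
    }
}
```

In this model the plain String key serializes without error, while the windowed key triggers a ClassCastException. Under the stated assumption, supplying the key serde explicitly, as both options above do, gives the library enough type information to use a serde suited to the windowed key.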