I'm relatively new to Kafka Streams and Spring Cloud Stream, and I'm having a hard time getting the windowed aggregation features to work.
What I'm trying to do is use session windows to aggregate a stream of user interaction events into user sessions.
My code looks like this:
@EnableBinding(KafkaStreamsProcessor::class)
inner class SessionProcessorApplication {

    @StreamListener("input")
    @SendTo("output")
    fun process(input: KStream<*, UserInteractionEvent>): KStream<*, UserSession> {
        return input
            .groupBy({ _, v -> v.userProjectId }, Serialized.with(Serdes.String(), UserInteractionEventSerde()))
            .windowedBy(SessionWindows.with(TimeUnit.MINUTES.toMillis(15)))
            .aggregate(
                Initializer<Session>(::Session),
                Aggregator<String, UserInteractionEvent, Session> { _, event, session -> session.interactions + event.interaction; session },
                Merger<String, Session> { _, session1, session2 -> Session.merge(session1, session2) },
                Materialized.`as`<String, Session, SessionStore<Bytes, ByteArray>>("windowed-sessions")
                    .withKeySerde(Serdes.String())
                    .withValueSerde(SessionSerde()))
            .toStream()
            .map { windowed, session ->
                KeyValue(windowed.key(),
                    UserSession(windowed.key(),
                        session.interactions,
                        Instant.ofEpochSecond(windowed.window().start()),
                        Instant.ofEpochSecond(windowed.window().end())))
            }
    }
}
I seem to be running into a problem in the aggregation step: I'm seeing a ClassCastException when the windowed-sessions state store is flushed, and I'm not sure how to proceed from here. If anyone can point out where I'm going wrong, or point me to some documentation on using session windows with custom Serdes, I'd really appreciate it!
Thanks very much!
Full stack trace below:
Exception in thread "default-dc0af3aa-8d8d-4b51-b0de-cdeb2dd83db6-StreamThread-1" org.apache.kafka.streams.errors.ProcessorStateException: task [1_0] Failed to flush state store windowed-sessions
    at org.apache.kafka.streams.processor.internals.ProcessorStateManager.flush(ProcessorStateManager.java:245)
    at org.apache.kafka.streams.processor.internals.AbstractTask.flushState(AbstractTask.java:196)
    at org.apache.kafka.streams.processor.internals.StreamTask.flushState(StreamTask.java:327)
    at org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:307)
    at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:208)
    at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:302)
    at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:292)
    at org.apache.kafka.streams.processor.internals.AssignedTasks$2.apply(AssignedTasks.java:87)
    at org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:452)
    at org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:381)
    at org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:310)
    at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1018)
    at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:835)
    at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
    at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744)
Caused by: org.apache.kafka.streams.errors.StreamsException: A serializer (key: org.apache.kafka.common.serialization.ByteArraySerializer / value: org.apache.kafka.common.serialization.ByteArraySerializer) is not compatible to the actual key or value type (key type: java.lang.String / value type: [B). Change the default Serdes in StreamConfig or provide correct Serdes via method parameters.
    at org.apache.kafka.streams.processor.internals.SinkNode.process(SinkNode.java:91)
    at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:85)
    at org.apache.kafka.streams.kstream.internals.KStreamMap$KStreamMapProcessor.process(KStreamMap.java:42)
    at org.apache.kafka.streams.processor.internals.ProcessorNode$1.run(ProcessorNode.java:46)
    at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:208)
    at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:124)
    at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:85)
    at org.apache.kafka.streams.kstream.internals.KStreamMap$KStreamMapProcessor.process(KStreamMap.java:42)
    at org.apache.kafka.streams.processor.internals.ProcessorNode$1.run(ProcessorNode.java:46)
    at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:208)
    at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:124)
    at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:85)
    at org.apache.kafka.streams.kstream.internals.KStreamMap$KStreamMapProcessor.process(KStreamMap.java:42)
    at org.apache.kafka.streams.processor.internals.ProcessorNode$1.run(ProcessorNode.java:46)
    at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:208)
    at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:124)
    at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:85)
    at org.apache.kafka.streams.kstream.internals.KStreamMapValues$KStreamMapProcessor.process(KStreamMapValues.java:41)
    at org.apache.kafka.streams.processor.internals.ProcessorNode$1.run(ProcessorNode.java:46)
    at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:208)
    at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:124)
    at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:85)
    at org.apache.kafka.streams.kstream.internals.ForwardingCacheFlushListener.apply(ForwardingCacheFlushListener.java:42)
    at org.apache.kafka.streams.state.internals.CachingSessionStore.putAndMaybeForward(CachingSessionStore.java:176)
    at org.apache.kafka.streams.state.internals.CachingSessionStore.access$000(CachingSessionStore.java:38)
    at org.apache.kafka.streams.state.internals.CachingSessionStore$1.apply(CachingSessionStore.java:88)
    at org.apache.kafka.streams.state.internals.NamedCache.flush(NamedCache.java:141)
    at org.apache.kafka.streams.state.internals.NamedCache.flush(NamedCache.java:99)
    at org.apache.kafka.streams.state.internals.ThreadCache.flush(ThreadCache.java:127)
    at org.apache.kafka.streams.state.internals.CachingSessionStore.flush(CachingSessionStore.java:196)
    at org.apache.kafka.streams.state.internals.MeteredSessionStore.flush(MeteredSessionStore.java:165)
    at org.apache.kafka.streams.processor.internals.ProcessorStateManager.flush(ProcessorStateManager.java:242)
    ... 14 more
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to [B
    at org.apache.kafka.common.serialization.ByteArraySerializer.serialize(ByteArraySerializer.java:21)
    at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.send(RecordCollectorImpl.java:90)
    at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.send(RecordCollectorImpl.java:78)
    at org.apache.kafka.streams.processor.internals.SinkNode.process(SinkNode.java:87)
    ... 45 more
My configuration:
spring.cloud.stream.kafka.streams.bindings:
  default.key.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
  default.value.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
  input:
    consumer:
      valueSerde: com.teckro.analytics.UserInteractionEventSerde
  output:
    producer:
      valueSerde: com.teckro.analytics.UserSessionSerde
spring.cloud.stream.bindings:
  input:
    destination: test-interaction
    consumer:
      headerMode: raw
  output:
    destination: test-session
    producer:
      headerMode: raw
Answer 0 (score: 1):
I see a few issues with your configuration.
The way you configure the default Serdes should be changed as follows:
spring.cloud.stream.kafka.streams.binder.configuration:
  default.key.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
  default.value.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
spring.cloud.stream.kafka.streams.bindings:
  input:
    consumer:
      valueSerde: com.teckro.analytics.UserInteractionEventSerde
  output:
    producer:
      valueSerde: com.teckro.analytics.UserSessionSerde
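For context, default.key.serde and default.value.serde are plain Kafka Streams properties (they end up in the application's StreamsConfig), which is why they belong under the binder's configuration section rather than under an individual binding. Outside of Spring Cloud Stream, the same defaults would be set roughly like the sketch below; the application id and broker address are hypothetical placeholders:

import java.util.Properties
import org.apache.kafka.common.serialization.Serdes
import org.apache.kafka.streams.StreamsConfig

// Sketch only: the same default-Serde settings expressed directly as Kafka Streams properties.
val props = Properties().apply {
    put(StreamsConfig.APPLICATION_ID_CONFIG, "session-processor")    // hypothetical application id
    put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")    // hypothetical broker address
    // Equivalent to default.key.serde / default.value.serde in the configuration above
    put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.StringSerde::class.java)
    put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.StringSerde::class.java)
}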
It also looks like you are relying on native Serdes for all (de)serialization, so you need to indicate that in the configuration as well; by default, the binder performs the input/output (de)serialization itself rather than delegating it to Kafka:
spring.cloud.stream.bindings:
  input:
    destination: test-interaction
    consumer:
      useNativeDecoding: true
  output:
    destination: test-session
    producer:
      useNativeEncoding: true
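As for documentation on custom Serdes with session windows: a Serde is just a Serializer/Deserializer pair, and the session-window aggregation only needs the value Serde supplied through Materialized (as you already do) plus matching Serdes on the bindings. Purely as an illustration (your actual UserSessionSerde and SessionSerde are not shown in the question), a JSON-backed Serde using Jackson could look roughly like this, assuming UserSession is a plain data class:

import com.fasterxml.jackson.databind.ObjectMapper
import org.apache.kafka.common.serialization.Deserializer
import org.apache.kafka.common.serialization.Serde
import org.apache.kafka.common.serialization.Serializer

// Illustrative JSON-backed Serde; not necessarily how the question's UserSessionSerde is implemented.
// Note: java.time fields such as Instant would additionally need the JavaTimeModule registered on the mapper.
class UserSessionSerde : Serde<UserSession> {

    private val mapper = ObjectMapper()

    override fun configure(configs: MutableMap<String, *>?, isKey: Boolean) {}
    override fun close() {}

    override fun serializer(): Serializer<UserSession> = object : Serializer<UserSession> {
        override fun configure(configs: MutableMap<String, *>?, isKey: Boolean) {}
        override fun serialize(topic: String?, data: UserSession?): ByteArray? =
            data?.let { mapper.writeValueAsBytes(it) }    // null stays null (tombstones)
        override fun close() {}
    }

    override fun deserializer(): Deserializer<UserSession> = object : Deserializer<UserSession> {
        override fun configure(configs: MutableMap<String, *>?, isKey: Boolean) {}
        override fun deserialize(topic: String?, data: ByteArray?): UserSession? =
            data?.let { mapper.readValue(it, UserSession::class.java) }
        override fun close() {}
    }
}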
If the problem still persists after these changes, please create a minimal sample project on GitHub and share it with us, and we'll take a look.