I'm dealing with a situation where duplicate messages may reach a consumer (a Kafka Streams application). To work with a typical case, let's assume it's an OrderCreatedEvent and the KStream has the logic to process the order. The event carries an order ID, which lets me identify duplicate messages.

What I want to do:

1) Add every order to a persistent state store.

2) When processing a message in the KStream, query the state store to check whether the message has already been received, and in that case do nothing.
val persistentKeyValueStore = Stores.persistentKeyValueStore("order-store")

val stateStore: Materialized<Int, Order, KeyValueStore<Bytes, ByteArray>> =
        Materialized.`as`<Int, Order>(persistentKeyValueStore)
                .withKeySerde(intSerde)
                .withValueSerde(orderSerde)

// reduce keeps the latest Order per key, materialized in "order-store"
val orderTable: KTable<Int, Order> = input.groupByKey(Serialized.with(intSerde, orderSerde))
        .reduce({ _, y -> y }, stateStore)
var orderStream: KStream<Int, Order> = ...

orderStream.filter { XXX }
        .map { key, value ->
            processingLogic()
            KeyValue(key, value)
        }...
In the filter { XXX } part I want to query the state store to check whether the order ID is already there (assuming the order ID is used as the key of the key-value store), filtering out orders that have already been processed (i.e. that are in the state store).
My first question is: how can I query a state store from the KStream DSL, for example inside a filter operation?
Second question: how should I handle the arrival of a new (not yet processed) message in this setup? If the KTable persists the order to the state store before the orderStream KStream executes, the message will already be in the store by the time the filter runs. Orders should only be added once their processing has completed. How can I achieve that? I probably shouldn't use a KTable at all, but rather something like:
orderStream.filter { key, _ -> keystore.get(key) == null }
        .map { key, value ->
            processingLogic()
            KeyValue(key, value)
        }
        .foreach { key, value ->
            keystore.put(key, value)
        }
Answer 0 (score: 0)
Following Matthias's directions, this is how I implemented it:
DeduplicationTransformer

package com.codependent.outboxpattern.operations.stream

import com.codependent.outboxpattern.account.TransferEmitted
import org.apache.kafka.streams.KeyValue
import org.apache.kafka.streams.kstream.Transformer
import org.apache.kafka.streams.processor.ProcessorContext
import org.apache.kafka.streams.state.KeyValueStore
import org.slf4j.LoggerFactory

@Suppress("UNCHECKED_CAST")
class DeduplicationTransformer : Transformer<String, TransferEmitted, KeyValue<String, TransferEmitted>> {

    private val logger = LoggerFactory.getLogger(javaClass)
    private lateinit var dedupStore: KeyValueStore<String, String>
    private lateinit var context: ProcessorContext

    override fun init(context: ProcessorContext) {
        this.context = context
        dedupStore = context.getStateStore(DEDUP_STORE) as KeyValueStore<String, String>
    }

    override fun transform(key: String, value: TransferEmitted): KeyValue<String, TransferEmitted>? {
        return if (isDuplicate(key)) {
            logger.warn("****** Detected duplicated transfer {}", key)
            null // returning null drops the record
        } else {
            logger.warn("****** Registering transfer {}", key)
            dedupStore.put(key, key) // remember the key so later duplicates are dropped
            KeyValue(key, value)
        }
    }

    private fun isDuplicate(key: String) = dedupStore[key] != null

    override fun close() {
    }
}
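For the transformer to find the store, a store named DEDUP_STORE has to be registered with the topology and its name passed to transform(). With the Spring Cloud Stream Kafka Streams binder, the @KafkaStreamsStateStore annotation in the configuration below creates and registers it: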
FraudKafkaStreamsConfiguration

const val DEDUP_STORE = "dedup-store"

@Suppress("UNCHECKED_CAST")
@EnableBinding(TransferKafkaStreamsProcessor::class)
class FraudKafkaStreamsConfiguration(private val fraudDetectionService: FraudDetectionService) {

    private val logger = LoggerFactory.getLogger(javaClass)

    @KafkaStreamsStateStore(name = DEDUP_STORE, type = KafkaStreamsStateStoreProperties.StoreType.KEYVALUE)
    @StreamListener
    @SendTo(value = ["outputKo", "outputOk"])
    fun process(@Input("input") input: KStream<String, TransferEmitted>): Array<KStream<String, *>>? {
        val fork: Array<KStream<String, *>> = input
                .transform(TransformerSupplier { DeduplicationTransformer() }, DEDUP_STORE)
                .branch(Predicate { _: String, value -> fraudDetectionService.isFraudulent(value) },
                        Predicate { _: String, value -> !fraudDetectionService.isFraudulent(value) }) as Array<KStream<String, *>>
        ...
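For reference, outside Spring Cloud Stream the same wiring can be done with the plain Kafka Streams API, roughly like this (a sketch of mine: the topic names and transferEmittedSerde are placeholders). Also note that the dedup store as written grows without bound; a windowed store combined with a scheduled punctuator that expires old keys is a common way to keep it bounded.

import org.apache.kafka.common.serialization.Serdes
import org.apache.kafka.streams.StreamsBuilder
import org.apache.kafka.streams.kstream.Consumed
import org.apache.kafka.streams.kstream.Produced
import org.apache.kafka.streams.kstream.TransformerSupplier
import org.apache.kafka.streams.state.Stores

val builder = StreamsBuilder()

// Create the dedup store and register it with the topology, so that
// transform() can attach it by name.
builder.addStateStore(
        Stores.keyValueStoreBuilder(
                Stores.persistentKeyValueStore(DEDUP_STORE),
                Serdes.String(),
                Serdes.String()))

builder.stream("transfers", Consumed.with(Serdes.String(), transferEmittedSerde))
        .transform(TransformerSupplier { DeduplicationTransformer() }, DEDUP_STORE)
        .to("deduplicated-transfers", Produced.with(Serdes.String(), transferEmittedSerde))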