I am trying to understand what state KTables are expected to be in.
I was reading the interactive queries documentation, which says:
This store will hold the latest count for any word that is found on the topic "word-count-input".
Let's say a message has been successfully sent to topic T.
This article says:
KTable lookups are always done on the current state of the KTable; thus, out-of-order records can yield non-deterministic results.
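To make that caveat concrete, here is a toy plain-Java illustration (no Kafka APIs involved, just a `Map` standing in for the KTable's state store): a lookup only ever sees the table's current state, so whether the table update or the stream record is processed first changes the result.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (not Kafka code): a KTable lookup always sees the table's
// *current* state, so processing order determines the join result.
public class LookupOrderDemo {
    public static void main(String[] args) {
        Map<String, Long> table = new HashMap<>(); // stands in for the KTable state store

        // Case 1: the table update is processed before the stream record.
        table.put("key", 1L);
        Long seenInOrder = table.get("key"); // lookup sees 1

        // Case 2: the table update arrives out of order (late),
        // so the stream record performs its lookup first.
        table.clear();
        Long seenOutOfOrder = table.get("key"); // lookup sees null: no match yet
        table.put("key", 1L);                   // update arrives too late

        System.out.println(seenInOrder + " vs " + seenOutOfOrder); // 1 vs null
    }
}
```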
Given that:
3. Is the join operation guaranteed to always happen after the aggregation, i.e. is every new record always aggregated first and then joined, never the other way around?
Added:
My main question is when the data can be requested via interactive queries or a join. Is it possible to get stale data / is there a possibility of lag?
Answer 0 (score: 2)
If a message has been successfully sent to topic T:
Yes, the state store (builder.table("T", ...)) is always updated with the latest value for the given key.
Interactive queries will return the new value once the update for the existing key has been received.
Yes, all state stores that are somehow linked to topic T, e.g. builder.stream("T").groupByKey().aggregate(...),
will also be updated based on the new update.
For S1 = builder.stream("T"), T1 = S1.groupByKey().aggregate(...), S2 = S1.join(T1),
it follows stream-table join semantics: an update on the KTable always updates the internal right-side join state, but the join operation is only triggered when a new record arrives on the stream (left) side.
Here is a good description of the KStream-KTable join semantics: https://kafka.apache.org/20/documentation/streams/developer-guide/dsl-api.html#kstream-ktable-join
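The semantics above can be sketched in plain Java (no Kafka APIs, just a `Map` for the table state): a table-side update only refreshes the join state and emits nothing, while a stream-side record triggers a join against the current table state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (not Kafka code) of KStream-KTable join semantics.
public class StreamTableJoinModel {
    private final Map<String, Long> tableState = new HashMap<>();
    private final List<String> joinOutput = new ArrayList<>();

    // A KTable update: refreshes the right-hand join state, emits nothing.
    void onTableUpdate(String key, Long value) {
        tableState.put(key, value);
    }

    // A KStream record: triggers the join against the current table state.
    void onStreamRecord(String key, Long value) {
        Long tableValue = tableState.get(key);
        if (tableValue != null) { // inner join: drop stream records with no match
            joinOutput.add(value + "," + tableValue);
        }
    }

    public static void main(String[] args) {
        StreamTableJoinModel model = new StreamTableJoinModel();
        model.onTableUpdate("key", 10L);  // no output
        model.onStreamRecord("key", 1L);  // emits "1,10"
        model.onTableUpdate("key", 20L);  // still no output
        model.onStreamRecord("key", 2L);  // emits "2,20"
        System.out.println(model.joinOutput); // [1,10, 2,20]
    }
}
```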
Answer 1 (score: 0)
After some debugging, I found the answer to part 3. Here is an example.
Test class:
package myapps;

import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.LongSerializer;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.TopologyTestDriver;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.test.ConsumerRecordFactory;
import org.junit.Assert;
import org.junit.Test;

import java.util.Properties;

public class TopologyTest {

    private static final String INPUT_TOPIC = "input-topic";
    private static final String OUTPUT_TOPIC = "output-topic";

    @Test
    public void testStreams() {
        Topology topology = createTopology();

        Properties config = new Properties();
        config.put(StreamsConfig.APPLICATION_ID_CONFIG, "test");
        config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234");
        config.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        config.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.Long().getClass().getName());

        try (TopologyTestDriver testDriver = new TopologyTestDriver(topology, config)) {
            ConsumerRecordFactory<String, Long> factory = new ConsumerRecordFactory<>(
                    INPUT_TOPIC, new StringSerializer(), new LongSerializer());

            testDriver.pipeInput(factory.create(INPUT_TOPIC, "key", 1L));
            testDriver.pipeInput(factory.create(INPUT_TOPIC, "key", 2L));
            testDriver.pipeInput(factory.create(INPUT_TOPIC, "key", 3L));

            ProducerRecord<String, String> pr1 = testDriver.readOutput(OUTPUT_TOPIC, new StringDeserializer(), new StringDeserializer());
            ProducerRecord<String, String> pr2 = testDriver.readOutput(OUTPUT_TOPIC, new StringDeserializer(), new StringDeserializer());
            ProducerRecord<String, String> pr3 = testDriver.readOutput(OUTPUT_TOPIC, new StringDeserializer(), new StringDeserializer());

            // Each output is "<value>,<aggregate>": the aggregate already includes
            // the current record, i.e. the aggregation ran before the join.
            Assert.assertEquals("1,1", pr1.value());
            Assert.assertEquals("2,3", pr2.value());
            Assert.assertEquals("3,6", pr3.value());
        }
    }

    private Topology createTopology() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, Long> inputStream = builder.stream(INPUT_TOPIC);

        // Running sum per key, materialized as the state store "store"
        KTable<String, Long> table = inputStream.groupByKey().aggregate(
                () -> 0L,
                (key, value, aggregate) -> value + aggregate,
                Materialized.as("store")
        );

        // Join the same stream against the aggregate it feeds
        KStream<String, String> joined = inputStream
                .join(table, (value, aggregate) -> value + "," + aggregate);

        joined.to(OUTPUT_TOPIC, Produced.with(Serdes.String(), Serdes.String()));
        return builder.build();
    }
}
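To see why those assertions hold, the per-record processing order can be modeled in plain Java (my sketch, not Kafka code): for each input record the aggregate (a running sum) is updated first, and only then does the join concatenate the record's value with the fresh aggregate.

```java
// Plain-Java check of the expected test values, assuming
// aggregate-then-join processing order for each record.
public class ExpectedOutputCheck {
    public static void main(String[] args) {
        long[] inputs = {1L, 2L, 3L};
        long aggregate = 0L;
        StringBuilder out = new StringBuilder();
        for (long value : inputs) {
            aggregate += value; // aggregation step runs first
            out.append(value).append(",").append(aggregate).append(" ");
        }
        System.out.println(out.toString().trim()); // 1,1 2,3 3,6
    }
}
```

If the join ran first, the first output would be "1,0" (the aggregate before the record was added), which is exactly what the test rules out.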
pom.xml
<dependencies>
    <!-- Apache Kafka dependencies -->
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-streams</artifactId>
        <version>2.3.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/junit/junit -->
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.12</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-streams-test-utils</artifactId>
        <version>2.3.0</version>
        <scope>test</scope>
    </dependency>
</dependencies>