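I have a Java Kafka Streams application that reads from a topic, does some filtering and transformation, and writes the data back to Kafka on a different topic. I print the stream object at every step. I noticed that if I send more than a few dozen records to the input topic, my Kafka Streams application consumes only some of the records.
When I consume from the input topic with kafka-console-consumer.sh, I do receive all the records.
I am running Kafka 1.0.0 with a single broker and a single-partition topic.
Any idea why?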
public static void main(String[] args) {
    // Connection settings and topic names come from the environment.
    final String bootstrapServers = System.getenv("KAFKA");
    final String inputTopic = System.getenv("INPUT_TOPIC");
    final String outputTopic = System.getenv("OUTPUT_TOPIC");
    final String gatewayTopic = System.getenv("GATEWAY_TOPIC");

    // String serdes by default; start from the earliest offset when none is committed.
    final Properties streamsConfiguration = new Properties();
    streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "PreProcess");
    streamsConfiguration.put(StreamsConfig.CLIENT_ID_CONFIG, "PreProcess-client");
    streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    streamsConfiguration.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    streamsConfiguration.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    streamsConfiguration.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 300L);

    // Source stream of raw text records, printed as read.
    final StreamsBuilder builder = new StreamsBuilder();
    final KStream<String, String> textLines = builder.stream(inputTopic);
    textLines.print();

    // Gateway lookup table, read with an explicit Gateway value serde.
    StreamsTransformation streamsTransformation = new StreamsTransformation(builder);
    KTable<String,Gateway> gatewayKTable = builder.table(gatewayTopic, Consumed.with(Serdes.String(), SerdesUtils.getGatewaySerde()));

    // Combine the text lines with the gateway table, then print the result.
    KStream<String, Message> gatewayIdMessageKStream = streamsTransformation.getStringMessageKStream(textLines,gatewayKTable);
    gatewayIdMessageKStream.print();

    // Transform to FlatSensor records, write them to the output topic, and print them.
    KStream<String, FlatSensor> keyFlatSensorKStream = streamsTransformation.transformToKeyFlatSensorKStream(gatewayIdMessageKStream);
    keyFlatSensorKStream.to(outputTopic, Produced.with(Serdes.String(), SerdesUtils.getFlatSensorSerde()));
    keyFlatSensorKStream.print();

    KafkaStreams streams = new KafkaStreams(builder.build(), streamsConfiguration);
    // Remove this application's local state before starting.
    streams.cleanUp();
    streams.start();

    // Add shutdown hook to respond to SIGTERM and gracefully close Kafka Streams
    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
        streams.close();
    }));
}
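
One thing that may be worth ruling out, sketched here under an assumption: the body of getStringMessageKStream is not shown, so it is only a guess that it joins textLines against gatewayKTable. If it does, an inner KStream-KTable join emits nothing for a stream record whose key has no entry in the table at the moment the record is processed, and stream records with a null key or null value do not trigger the join at all. The plain-Java model below does not use the Kafka Streams API; the names JoinDropSketch, Event, run, and sampleEvents are hypothetical and exist only to illustrate how such records would be missing from the later print() calls and from the output topic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plain-Java model of stream-table join semantics.
// It does not use the Kafka Streams API.
public class JoinDropSketch {

    // One event in arrival order: either a table update or a stream record.
    static final class Event {
        final boolean tableUpdate;
        final String key;
        final String value;

        Event(boolean tableUpdate, String key, String value) {
            this.tableUpdate = tableUpdate;
            this.key = key;
            this.value = value;
        }
    }

    // Processes events in order and returns the joined records that come out.
    static List<String> run(List<Event> events, boolean leftJoin) {
        Map<String, String> table = new HashMap<>();
        List<String> joined = new ArrayList<>();
        for (Event e : events) {
            if (e.tableUpdate) {
                table.put(e.key, e.value); // the table only knows what it has seen so far
                continue;
            }
            if (e.key == null || e.value == null) {
                continue; // null key or value: the record does not trigger the join
            }
            String match = table.get(e.key);
            if (match == null && !leftJoin) {
                continue; // inner join: no table entry yet, so nothing is emitted
            }
            joined.add(e.key + ":" + e.value + "+" + match);
        }
        return joined;
    }

    // Made-up sample data: six events, four of them stream records.
    static List<Event> sampleEvents() {
        return Arrays.asList(
            new Event(true, "gw1", "G1"),
            new Event(false, "gw1", "m1"),
            new Event(false, "gw2", "m2"),  // arrives before its table entry
            new Event(true, "gw2", "G2"),
            new Event(false, "gw2", "m3"),
            new Event(false, null, "m4"));  // null key
    }

    public static void main(String[] args) {
        // prints: inner join: [gw1:m1+G1, gw2:m3+G2]
        System.out.println("inner join: " + run(sampleEvents(), false));
        // prints: left join:  [gw1:m1+G1, gw2:m2+null, gw2:m3+G2]
        System.out.println("left join:  " + run(sampleEvents(), true));
    }
}
```

In this model four stream records go in, but only two come out of the inner join and three out of the left join. If the real application shows the same pattern, comparing the record count printed from textLines with the count printed from gatewayIdMessageKStream would show whether records go missing at that step or earlier.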