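I have a question about key deserialization in Kafka Streams. Specifically, I am using Kafka Connect with the Debezium connector to read data from a Postgres table. The data is imported into a Kafka topic, and two Avro schemas are created in the Kafka Schema Registry: one for the key and one for the value (which contains all of the table's columns).
I read this data into a GlobalKTable as follows: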
properties.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, SpecificAvroSerde.class);
properties.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, SpecificAvroSerde.class);
GlobalKTable<my.namespace.db.Key, my.namespace.db.Value> tableData = builder.globalTable("topic_name");
My problem is that I have a topology in which I need to join this GlobalKTable with a KStream, defined as follows:
SpecificAvroSerde<EventObj> eventsSpecificAvroSerde = new SpecificAvroSerde<>();
// The second argument is isKey: false, because this serde is used for record values.
eventsSpecificAvroSerde.configure(Collections.singletonMap(KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG,
        conf.getString("kafka.schema.registry.url")), false);
KStream<Integer, EventObj> events = builder.stream("another_topic_name", Consumed.with(Serdes.Integer(), eventsSpecificAvroSerde));
Note that the Avro schema for my.namespace.db.Key is:
{
"type": "record",
"name": "Key",
"namespace":"my.namespace.db",
"fields": [
{
"name": "id",
"type": "int"
}
]
}
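To make the difference between the two key representations concrete, here is a minimal plain-Java sketch (no Kafka dependencies) of the bytes each side would carry for id = 42. It assumes the Confluent wire format (a zero magic byte, a 4-byte schema ID, then the Avro binary body) and Avro's zigzag varint encoding for int. The schema ID 1 and the class name KeyBytesSketch are hypothetical; the real ID is whatever the Schema Registry assigned.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;

class KeyBytesSketch {
    // What Serdes.Integer() writes: the int as 4 big-endian bytes.
    static byte[] integerSerdeBytes(int id) {
        return ByteBuffer.allocate(4).putInt(id).array();
    }

    // Approximation of an Avro-serialized Key{id} in the Confluent wire format.
    // schemaId is an assumption; the registry decides the real value.
    static byte[] avroKeyBytes(int schemaId, int id) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(0); // magic byte
        byte[] sid = ByteBuffer.allocate(4).putInt(schemaId).array();
        out.write(sid, 0, sid.length);
        int n = (id << 1) ^ (id >> 31); // zigzag encoding of the int field
        while ((n & ~0x7F) != 0) {      // variable-length encoding, 7 bits per byte
            out.write((n & 0x7F) | 0x80);
            n >>>= 7;
        }
        out.write(n);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(integerSerdeBytes(42))); // [0, 0, 0, 42]
        System.out.println(Arrays.toString(avroKeyBytes(1, 42)));   // [0, 0, 0, 0, 1, 84]
    }
}
```

The two byte arrays differ in both length and content, so the same logical id is not directly comparable between the two topics without converting the key first.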
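Clearly, the keys of the GlobalKTable and the KStream are different objects, and I don't know how to implement the join. I initially tried the following, but it did not work: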
events.join(tableData,
        /* Convert the Integer key of the KStream into the Avro Key object
           of the GlobalKTable so that the join can be performed */
        (key, val) -> my.namespace.db.Key.newBuilder().setId(key).build(),
        (ev, tData) -> ... );
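For reference, the semantics I expect from this join can be modelled in plain Java as below. This is only a sketch of the idea, not Kafka Streams code: the GlobalKTable is stood in for by a HashMap, the record Key is a hypothetical simplification of the generated my.namespace.db.Key class, and the join helper and sample data are made up for illustration. In real Kafka Streams the lookup goes through the state store and its key serde rather than through equals/hashCode.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

class GlobalTableJoinSketch {
    // Hypothetical stand-in for the generated Avro class my.namespace.db.Key.
    record Key(int id) {}

    // Models an inner stream-to-global-table join: map each stream record to a
    // table key, look it up, and emit a joined value only when a row exists.
    static <K, V, GK, GV, R> List<R> join(
            List<Map.Entry<K, V>> stream,
            Map<GK, GV> table,
            BiFunction<K, V, GK> keyMapper,
            BiFunction<V, GV, R> joiner) {
        List<R> out = new ArrayList<>();
        for (Map.Entry<K, V> rec : stream) {
            GK lookupKey = keyMapper.apply(rec.getKey(), rec.getValue());
            GV tableValue = table.get(lookupKey);
            if (tableValue != null) { // inner join: unmatched records are dropped
                out.add(joiner.apply(rec.getValue(), tableValue));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Key, String> table = new HashMap<>();
        table.put(new Key(1), "row-1");
        table.put(new Key(2), "row-2");

        List<Map.Entry<Integer, String>> events = List.of(
                Map.entry(1, "event-a"),
                Map.entry(3, "event-b")); // no table row with id 3

        List<String> joined = join(events, table,
                (key, val) -> new Key(key),       // Integer key -> Key object
                (ev, tData) -> ev + "+" + tData); // combine event and table row
        System.out.println(joined); // [event-a+row-1]
    }
}
```

In this model the stream is never re-keyed or repartitioned; only the lookup key is derived per record, which is the behaviour I am hoping to get from the GlobalKTable join.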
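The output I get is shown below. I can see a WARN about one of the topics involved in the join (which looks suspicious), but there is no other output for the joined entities; it is as if there were nothing to consume.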
INFO [Consumer clientId=kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1-consumer, groupId=kafka-streams] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:336)
INFO stream-thread [kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1-consumer] Assigned tasks to clients as {0401c29c-30a9-4969-93f9-5a83b3c834b4=[activeTasks: ([0_0]) standbyTasks: ([]) assignedTasks: ([0_0]) prevActiveTasks: ([]) prevAssignedTasks: ([]) capacity: 1]}. (org.apache.kafka.streams.processor.internals.StreamPartitionAssignor:341)
WARN [Consumer clientId=kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1-consumer, groupId=kafka-streams] The following subscribed topics are not assigned to any members: [my-topic] (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:241)
INFO [Consumer clientId=kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1-consumer, groupId=kafka-streams] Successfully joined group with generation 1 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:341)
INFO [Consumer clientId=kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1-consumer, groupId=kafka-streams] Setting newly assigned partitions [mip-events-2-0] (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:341)
INFO stream-thread [kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED (org.apache.kafka.streams.processor.internals.StreamThread:346)
INFO KafkaAvroSerializerConfig values:
schema.registry.url = [http://kafka-schema-registry:8081]
auto.register.schemas = true
max.schemas.per.subject = 1000
(io.confluent.kafka.serializers.KafkaAvroSerializerConfig:175)
INFO KafkaAvroDeserializerConfig values:
schema.registry.url = [http://kafka-schema-registry:8081]
auto.register.schemas = true
max.schemas.per.subject = 1000
specific.avro.reader = true
(io.confluent.kafka.serializers.KafkaAvroDeserializerConfig:175)
INFO KafkaAvroSerializerConfig values:
schema.registry.url = [http://kafka-schema-registry:8081]
auto.register.schemas = true
max.schemas.per.subject = 1000
(io.confluent.kafka.serializers.KafkaAvroSerializerConfig:175)
INFO KafkaAvroDeserializerConfig values:
schema.registry.url = [http://kafka-schema-registry:8081]
auto.register.schemas = true
max.schemas.per.subject = 1000
specific.avro.reader = true
(io.confluent.kafka.serializers.KafkaAvroDeserializerConfig:175)
INFO stream-thread [kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1] partition assignment took 10 ms.
current active tasks: [0_0]
current standby tasks: []
previous active tasks: []
(org.apache.kafka.streams.processor.internals.StreamThread:351)
INFO stream-thread [kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1] State transition from PARTITIONS_ASSIGNED to RUNNING (org.apache.kafka.streams.processor.internals.StreamThread:346)
INFO stream-client [kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4]State transition from REBALANCING to RUNNING (org.apache.kafka.streams.KafkaStreams:346)
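Can I do this join in Kafka Streams?
Note that it does work if I read the topic as a KTable and use selectKey on the KStream to convert the key, but I want to avoid the repartitioning.
Or is the right approach to import my data from the database in a different way, so that no Avro object is created for the key? If so, how can I enable that with the Debezium connector and Kafka Connect using the AvroConverter?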