Join using a key

Date: 2018-02-23 08:24:56

Tags: apache-kafka-streams apache-kafka-connect debezium

I have a question about key deserialization in Kafka Streams. Specifically, I use Kafka Connect with the Debezium connector to read data from a Postgres table. The data is imported into a Kafka topic, and two Avro schemas are created in the Kafka Schema Registry: one for the key and one for the value (the latter contains all the columns of the table).

I read this data into a GlobalKTable as follows:

properties.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, SpecificAvroSerde.class);
properties.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, SpecificAvroSerde.class);

GlobalKTable<my.namespace.db.Key, my.namespace.db.Value> tableData = builder.globalTable("topic_name");
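
For these defaults to work, the SpecificAvroSerde also needs the Schema Registry URL in the streams configuration, since Kafka Streams passes its properties on to the default serdes. A minimal sketch of the setup (application id and bootstrap servers are placeholders; the registry URL is the one visible in the logs below):

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;
import io.confluent.kafka.serializers.KafkaAvroSerializerConfig;
import io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde;

Properties properties = new Properties();
properties.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");   // placeholder
properties.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092");    // placeholder
properties.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, SpecificAvroSerde.class);
properties.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, SpecificAvroSerde.class);
// Without this property the default serdes cannot reach the Schema Registry:
properties.put(KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG,
        "http://kafka-schema-registry:8081");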

My problem is that I have a topology in which I need to join this GlobalKTable with a KStream, as follows:

SpecificAvroSerde<EventObj> eventsSpecificAvroSerde = new SpecificAvroSerde<>();
// Configure the serde with the Schema Registry URL (false = value serde)
eventsSpecificAvroSerde.configure(Collections.singletonMap(
        KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG,
        conf.getString("kafka.schema.registry.url")), false);

KStream<Integer, EventObj> events = builder.stream("another_topic_name",
        Consumed.with(Serdes.Integer(), eventsSpecificAvroSerde));
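
Note that the false flag in the configure call above marks the serde as a value serde. If the GlobalKTable should not rely on the default serdes, the same pattern can be applied to it explicitly, with true for the key side; a sketch using the generated Avro classes:

SpecificAvroSerde<my.namespace.db.Key> keySerde = new SpecificAvroSerde<>();
keySerde.configure(Collections.singletonMap(
        KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG,
        conf.getString("kafka.schema.registry.url")), true);   // true = key serde

SpecificAvroSerde<my.namespace.db.Value> valueSerde = new SpecificAvroSerde<>();
valueSerde.configure(Collections.singletonMap(
        KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG,
        conf.getString("kafka.schema.registry.url")), false);  // false = value serde

GlobalKTable<my.namespace.db.Key, my.namespace.db.Value> tableData =
        builder.globalTable("topic_name", Consumed.with(keySerde, valueSerde));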

Note that the Avro schema of my.namespace.db.Key is:

{
  "type": "record",
  "name": "Key",
  "namespace":"my.namespace.db",
  "fields": [
    {
      "name": "id",
      "type": "int"
    }
  ]
}

Obviously the keys of the GlobalKTable and the KStream are different types, and I don't know how to make the join work. I initially tried the following, but it did not work:

events.join(tableData,
        // Convert the Integer key of the KStream into the Avro Key object
        // of the GlobalKTable so that the join can be performed
        (key, val) -> my.namespace.db.Key.newBuilder().setId(key).build(),
        (ev, tData) -> ... );
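
For reference, the overload targeted here is KStream#join(GlobalKTable, KeyValueMapper, ValueJoiner): the second argument maps each stream record to the table's key, and the third combines the two values. A fully typed sketch, where EnrichedEvent is a hypothetical result class:

KStream<Integer, EnrichedEvent> joined = events.join(
        tableData,
        // KeyValueMapper<Integer, EventObj, my.namespace.db.Key>
        (key, val) -> my.namespace.db.Key.newBuilder().setId(key).build(),
        // ValueJoiner<EventObj, my.namespace.db.Value, EnrichedEvent>
        (ev, tData) -> new EnrichedEvent(ev, tData));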

The output I get is shown below. I can see a WARN for one of the join's topics (which looks suspicious), but otherwise there is no output for the joined entities; it is as if there is nothing to consume.

INFO [Consumer clientId=kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1-consumer, groupId=kafka-streams] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:336) 
INFO stream-thread [kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1-consumer] Assigned tasks to clients as {0401c29c-30a9-4969-93f9-5a83b3c834b4=[activeTasks: ([0_0]) standbyTasks: ([]) assignedTasks: ([0_0]) prevActiveTasks: ([]) prevAssignedTasks: ([]) capacity: 1]}. (org.apache.kafka.streams.processor.internals.StreamPartitionAssignor:341) 
WARN [Consumer clientId=kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1-consumer, groupId=kafka-streams] The following subscribed topics are not assigned to any members: [my-topic]  (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:241) 
INFO [Consumer clientId=kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1-consumer, groupId=kafka-streams] Successfully joined group with generation 1 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:341) 
INFO [Consumer clientId=kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1-consumer, groupId=kafka-streams] Setting newly assigned partitions [mip-events-2-0] (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:341) 
INFO stream-thread [kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED (org.apache.kafka.streams.processor.internals.StreamThread:346) 
INFO KafkaAvroSerializerConfig values: 
    schema.registry.url = [http://kafka-schema-registry:8081]
    auto.register.schemas = true
    max.schemas.per.subject = 1000
 (io.confluent.kafka.serializers.KafkaAvroSerializerConfig:175) 
INFO KafkaAvroDeserializerConfig values: 
    schema.registry.url = [http://kafka-schema-registry:8081]
    auto.register.schemas = true
    max.schemas.per.subject = 1000
    specific.avro.reader = true
 (io.confluent.kafka.serializers.KafkaAvroDeserializerConfig:175) 
INFO KafkaAvroSerializerConfig values: 
    schema.registry.url = [http://kafka-schema-registry:8081]
    auto.register.schemas = true
    max.schemas.per.subject = 1000
 (io.confluent.kafka.serializers.KafkaAvroSerializerConfig:175) 
INFO KafkaAvroDeserializerConfig values: 
    schema.registry.url = [http://kafka-schema-registry:8081]
    auto.register.schemas = true
    max.schemas.per.subject = 1000
    specific.avro.reader = true
 (io.confluent.kafka.serializers.KafkaAvroDeserializerConfig:175) 
INFO stream-thread [kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1] partition assignment took 10 ms.
    current active tasks: [0_0]
    current standby tasks: []
    previous active tasks: []
 (org.apache.kafka.streams.processor.internals.StreamThread:351) 
INFO stream-thread [kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4-StreamThread-1] State transition from PARTITIONS_ASSIGNED to RUNNING (org.apache.kafka.streams.processor.internals.StreamThread:346) 
INFO stream-client [kafka-streams-0401c29c-30a9-4969-93f9-5a83b3c834b4]State transition from REBALANCING to RUNNING (org.apache.kafka.streams.KafkaStreams:346) 

Can I make this join work in Kafka Streams? Note that it does work if I read the topic into a KTable and use selectKey on the KStream to transform the key, but I want to avoid the repartitioning this causes (see the sketches below).
Or is the correct approach to import my data from the database in a different way, so that no Avro object is created for the key, and if so, how can this be done with the Debezium connector and Kafka Connect with the AvroConverter enabled?
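
For comparison, the working variant mentioned above, which rekeys the KStream and joins it against a KTable at the cost of a repartition topic, would look roughly like this (the joiner body is elided as before):

KTable<my.namespace.db.Key, my.namespace.db.Value> table = builder.table("topic_name");

events
    .selectKey((key, val) -> my.namespace.db.Key.newBuilder().setId(key).build())
    .join(table, (ev, tData) -> ... );   // selectKey forces a repartition topic

As for the Connect side, one possible direction (an assumption, not something confirmed here) is Kafka Connect's built-in ExtractField transformation, which replaces the single-field key struct with its raw id field, so that no Avro record is generated for the key; the key is then serialized as a plain Avro int instead. A sketch of the hypothetical connector properties:

# Hypothetical additions to the Debezium connector configuration
transforms=extractKey
transforms.extractKey.type=org.apache.kafka.connect.transforms.ExtractField$Key
transforms.extractKey.field=id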

0 Answers:

There are no answers yet.