KafkaAvroSerializer with multiple avro registry URLs

Time: 2018-11-02 16:09:58

Tags: apache-kafka avro apache-kafka-streams confluent-schema-registry rocksdb

We have a KafkaAvroSerde configured with multiple avro registry URLs. Sometimes the serde times out while trying to register a schema against one of the URLs, and because it throws an IOException up into the streams application, the stream thread shuts down. From the Kafka Streams application's point of view, this defeats the purpose of being able to supply multiple URLs when creating the Avro serdes, since a runtime exception bubbling up the DSL API stack will shut down the stream thread. A few questions (a sketch of the serde configuration follows the list):

  1. Is there a good way to deal with this?
  2. Do we need to enforce a retry in the application logic (which can be tricky when you only materialize a topic into a store)?
  3. Otherwise, is there an Avro serde wrapper that can retry against the actually configured avro registry URLs?
  4. When materializing into a local RocksDB store, is there any added value in registering the schema in the registry, or should we configure auto.register.schemas to false?
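For context, a minimal sketch of such a serde configuration, assuming Confluent's SpecificAvroSerde; the registry hostnames and the ProgramMapping value class are placeholders, not taken from the question. The schema.registry.url setting accepts a comma-separated list of endpoints:

```java
import io.confluent.kafka.serializers.AbstractKafkaAvroSerDeConfig;
import io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde;

import java.util.HashMap;
import java.util.Map;

public class SerdeConfigSketch {
    // ProgramMapping stands in for the generated Avro value class.
    public static SpecificAvroSerde<ProgramMapping> buildValueSerde() {
        Map<String, Object> config = new HashMap<>();
        // schema.registry.url takes a comma-separated list; the client can
        // fail over between the endpoints, but a registry timeout can still
        // surface as a SerializationException in the streams application.
        config.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG,
            "http://registry-1:8081,http://registry-2:8081"); // placeholder hosts
        SpecificAvroSerde<ProgramMapping> serde = new SpecificAvroSerde<>();
        serde.configure(config, false); // false: configure as a value serde
        return serde;
    }
}
```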


Exception in thread "mediafirst-npvr-adapter-program-mapping-mtrl02nsbe02.pf.spop.ca-f5e097bd-ff1b-42da-9f7d-2ab9fa5d2b70-GlobalStreamThread" org.apache.kafka.common.errors.SerializationException: Error registering Avro schema: {"type":"record","name":"ProgramMapp
Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Register operation timed out; error code: 50002; error code: 50002
at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:191)
at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:218)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:307)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:299)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:294)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.registerAndGetId(CachedSchemaRegistryClient.java:61)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:100)
at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:79)
at io.confluent.kafka.serializers.KafkaAvroSerializer.serialize(KafkaAvroSerializer.java:53)
at io.confluent.kafka.streams.serdes.avro.SpecificAvroSerializer.serialize(SpecificAvroSerializer.java:65)
at io.confluent.kafka.streams.serdes.avro.SpecificAvroSerializer.serialize(SpecificAvroSerializer.java:38)
at org.apache.kafka.streams.state.StateSerdes.rawValue(StateSerdes.java:178)
at org.apache.kafka.streams.state.internals.MeteredKeyValueBytesStore$1.innerValue(MeteredKeyValueBytesStore.java:68)
at org.apache.kafka.streams.state.internals.MeteredKeyValueBytesStore$1.innerValue(MeteredKeyValueBytesStore.java:57)
at org.apache.kafka.streams.state.internals.InnerMeteredKeyValueStore.put(InnerMeteredKeyValueStore.java:199)
at org.apache.kafka.streams.state.internals.MeteredKeyValueBytesStore.put(MeteredKeyValueBytesStore.java:121)
at com.bell.cts.commons.kafka.store.custom.CustomStoreProcessor.process(CustomStoreProcessor.java:37)
at org.apache.kafka.streams.processor.internals.ProcessorNode$1.run(ProcessorNode.java:46)
at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:208)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:124)
at org.apache.kafka.streams.processor.internals.GlobalProcessorContextImpl.forward(GlobalProcessorContextImpl.java:52)
at org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:80)
at org.apache.kafka.streams.processor.internals.GlobalStateUpdateTask.update(GlobalStateUpdateTask.java:87)
at org.apache.kafka.streams.processor.internals.GlobalStreamThread$StateConsumer.pollAndUpdate(GlobalStreamThread.java:239)
at org.apache.kafka.streams.processor.internals.GlobalStreamThread.run(GlobalStreamThread.java:282)

1 answer:

Answer 0: (score: 0)

> From the Kafka Streams application's point of view, this defeats the purpose of being able to supply multiple URLs when creating the Avro serdes, since a runtime exception bubbling up the DSL API stack will shut down the stream thread.

I disagree here: from a Kafka Streams perspective, serialization failed, and thus the application does need to shut down. Note that Kafka Streams is agnostic to the Serdes you use, so it has no way of knowing that your Serde talks to a schema registry and could retry.

Thus, the Serde itself is responsible for handling retries internally. I am not aware of a wrapper that does this, but it should not be hard to build one yourself. I will create an internal ticket to track this feature request; I think it makes a lot of sense to add it to the out-of-the-box experience.
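As an illustration, here is a minimal sketch of such a wrapper (the class name, attempt count, and linear backoff are invented for the example): it delegates to any inner Serializer, such as a Confluent Avro serializer, and retries on SerializationException before rethrowing.

```java
import java.util.Map;

import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.serialization.Serializer;

public class RetryingSerializer<T> implements Serializer<T> {

    private final Serializer<T> inner;
    private final int maxAttempts;

    public RetryingSerializer(Serializer<T> inner, int maxAttempts) {
        this.inner = inner;
        this.maxAttempts = maxAttempts;
    }

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        inner.configure(configs, isKey);
    }

    @Override
    public byte[] serialize(String topic, T data) {
        SerializationException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return inner.serialize(topic, data);
            } catch (SerializationException e) {
                last = e; // e.g. a registry timeout: back off and retry
                try {
                    Thread.sleep(100L * attempt); // simple linear backoff
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw last;
                }
            }
        }
        throw last; // still failing after all attempts: give up and rethrow
    }

    @Override
    public void close() {
        inner.close();
    }
}
```

A matching Deserializer wrapper would follow the same pattern, and Serdes.serdeFrom(serializer, deserializer) can combine the two into a Serde.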

For RocksDB: all records that are written into RocksDB are also written into the changelog topic. Thus, to allow Kafka Streams to read this data when recovering state after an error, you do need to register the schemas.
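To illustrate question 4: in the sketch below (topic, store, and class names invented), the value serde serializes records both into the RocksDB store and into the changelog topic that Kafka Streams replays on recovery, so with auto.register.schemas set to false the schema must already exist in the registry or the writes will fail.

```java
import io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

public class MaterializeSketch {
    public static void build(StreamsBuilder builder,
                             SpecificAvroSerde<ProgramMapping> valueSerde) {
        // The same value serde is used for the local RocksDB store and for
        // the changelog topic that backs it.
        builder.table("program-mapping", // placeholder topic name
            Materialized.<String, ProgramMapping, KeyValueStore<Bytes, byte[]>>as(
                    "mapping-store") // placeholder store name
                .withKeySerde(Serdes.String())
                .withValueSerde(valueSerde));
    }
}
```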