Spring Kafka, Spring Cloud Stream and Avro compatibility: Unknown magic byte

Asked: 2019-01-30 20:06:04

Tags: java apache-kafka avro spring-cloud-stream confluent-schema-registry

I have a problem deserializing messages from a Kafka topic. The messages have been serialized using spring-cloud-stream and Apache Avro. I am reading them with Spring Kafka and trying to deserialize them. If I use spring-cloud to both produce and consume the messages, they deserialize fine. The problem is when I consume them with Spring Kafka and then attempt to deserialize.

I am using a schema registry (the Spring Boot schema registry for development, and a Confluent schema registry in production), but the deserialization problem seems to occur before the schema registry is even called.

It's hard to post all the relevant code in this question, so I have posted it in a repo on GitHub: https://github.com/robjwilkins/avro-example

The object I am sending over the topic is just a simple POJO:

@Data
public class Request {
  private String message;
}

The code which produces the message on Kafka looks like this:

@EnableBinding(MessageChannels.class)
@Slf4j
@RequiredArgsConstructor
@RestController
public class ProducerController {

  private final MessageChannels messageChannels;

  @GetMapping("/produce")
  public void produceMessage() {
    Request request = new Request();
    request.setMessage("hello world");
    Message<Request> requestMessage = MessageBuilder.withPayload(request).build();
    log.debug("sending message");
    messageChannels.testRequest().send(requestMessage);
  }
}
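
The MessageChannels binding interface isn't shown here; a minimal sketch of what it presumably looks like (the @Output channel name has to line up with the test-request binding in the application.yaml below):

import org.springframework.cloud.stream.annotation.Output;
import org.springframework.messaging.MessageChannel;

public interface MessageChannels {

  // channel name must match the "test-request" binding key in application.yaml
  @Output("test-request")
  MessageChannel testRequest();
}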

and the application.yaml:

spring:
  application.name: avro-producer
  kafka:
    bootstrap-servers: localhost:9092
    consumer.group-id: avro-producer
  cloud:
    stream:
      schema-registry-client.endpoint: http://localhost:8071
      schema.avro.dynamic-schema-generation-enabled: true
      kafka:
        binder:
          brokers: ${spring.kafka.bootstrap-servers}
      bindings:
        test-request:
          destination: test-request
          contentType: application/*+avro

I then have a consumer:

@Slf4j
@Component
public class TopicListener {

    @KafkaListener(topics = {"test-request"})
    public void listenForMessage(ConsumerRecord<String, Request> consumerRecord) {
        log.info("listenForMessage. got a message: {}", consumerRecord);
        consumerRecord.headers().forEach(header -> log.info("header. key: {}, value: {}", header.key(), asString(header.value())));
    }

    private String asString(byte[] byteArray) {
        return new String(byteArray, Charset.defaultCharset());
    }
}

The consuming project has this application.yaml configuration:

spring:
  application.name: avro-consumer
  kafka:
    bootstrap-servers: localhost:9092
    consumer:
      group-id: avro-consumer
      value-deserializer: io.confluent.kafka.serializers.KafkaAvroDeserializer
#      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      properties:
        schema.registry.url: http://localhost:8071
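
As an aside: even once the framing matches, the Confluent deserializer returns an Avro GenericRecord unless it is told to use the specific reader. A sketch of that extra property:

spring:
  kafka:
    consumer:
      properties:
        schema.registry.url: http://localhost:8071
        # needed for KafkaAvroDeserializer to return the Request type
        # rather than an Avro GenericRecord
        specific.avro.reader: true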

When the consumer receives a message it results in an exception:

2019-01-30 20:01:39.900 ERROR 30876 --- [ntainer#0-0-C-1] o.s.kafka.listener.LoggingErrorHandler   : Error while processing: null

org.apache.kafka.common.errors.SerializationException: Error deserializing key/value for partition test-request-0 at offset 43. If needed, please seek past the record to continue consumption.
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id -1
Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!

I have stepped through the deserialization code to the point where the exception is thrown:

public abstract class AbstractKafkaAvroDeserializer extends AbstractKafkaAvroSerDe {
....
private ByteBuffer getByteBuffer(byte[] payload) {
  ByteBuffer buffer = ByteBuffer.wrap(payload);
  // the first byte is expected to be the Confluent "magic byte" (0)
  if (buffer.get() != 0) {
    throw new SerializationException("Unknown magic byte!");
  } else {
    return buffer;
  }
}

This happens because the deserializer checks the byte content of the serialized object (the byte array) and expects the first byte to be 0, which it isn't. Hence I question whether the spring-cloud-stream MessageConverter which serialized the object is compatible with the io.confluent deserializer I am using to deserialize it, and if they aren't compatible, what do I do?
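
For reference: Confluent's serializer prepends a five-byte header, the magic byte 0 followed by a 4-byte schema id, before the Avro payload, whereas the spring-cloud-stream Avro converter writes unframed Avro bytes and carries schema information in the contentType header instead. A small debugging sketch for checking which format a consumed payload actually uses:

import java.nio.ByteBuffer;

public class WireFormatCheck {

  // inspect the first bytes of a record payload to see how it was framed
  public static String describe(byte[] payload) {
    ByteBuffer buffer = ByteBuffer.wrap(payload);
    byte first = buffer.get();
    if (first == 0 && payload.length >= 5) {
      // Confluent wire format: magic byte 0, then a 4-byte schema id
      return "Confluent-framed Avro, schema id " + buffer.getInt();
    }
    // anything else is likely unframed Avro, e.g. written by
    // spring-cloud-stream's Avro MessageConverter
    return "not Confluent-framed (first byte " + first + ")";
  }
}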

Thanks for any help.

4 Answers:

Answer 0 (score: 1):

The crux of this problem is that the producer is using spring-cloud-stream to post messages to Kafka, but the consumer uses spring-kafka. The reasons for this are:

  • The existing system is already well established and uses spring-cloud-stream
  • A requirement for the new consumer to listen to several topics with the same method, binding only on a csv list of topic names
  • A requirement to consume a collection of messages at once, rather than individually, so their contents can be written to a database in bulk

Spring-cloud-stream currently doesn't allow a consumer to bind a listener to multiple topics, and there is no way to consume a collection of messages at once (unless I'm mistaken).
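
For contrast, a minimal spring-kafka sketch of those two requirements (the listener.topics property and the batchFactory bean name are assumed, not from the repo): one listener bound to a csv list of topics, with records delivered a whole poll at a time:

import java.util.List;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;

@Configuration
public class BatchListenerSketch {

  // listener.topics is a hypothetical property holding a csv list of topic names
  @KafkaListener(topics = "#{'${listener.topics}'.split(',')}",
      containerFactory = "batchFactory")
  public void listen(List<ConsumerRecord<String, byte[]>> records) {
    // a whole poll() arrives at once, ready for a bulk database write
  }

  @Bean
  public ConcurrentKafkaListenerContainerFactory<String, byte[]> batchFactory(
      ConsumerFactory<String, byte[]> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, byte[]> factory =
        new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    factory.setBatchListener(true); // deliver a List<ConsumerRecord> per poll
    return factory;
  }
}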

I have found a solution which doesn't require any changes to the producer code, which uses spring-cloud-stream to publish messages to Kafka. Spring-cloud-stream uses a MessageConverter to manage serialization and deserialization. In AbstractAvroMessageConverter there are the methods convertFromInternal and convertToInternal which handle the transformation to/from a byte array. My solution was to extend this code (creating a class which extends AvroSchemaRegistryClientMessageConverter), so that I could reuse much of the spring-cloud-stream functionality, but with an interface that can be accessed from a spring-kafka KafkaListener. I then amended my TopicListener to use this class to do the conversion:

The converter:

@Component
@Slf4j
public class AvroKafkaMessageConverter extends AvroSchemaRegistryClientMessageConverter {

  public AvroKafkaMessageConverter(SchemaRegistryClient schemaRegistryClient) {
    super(schemaRegistryClient, new NoOpCacheManager());
  }

  public <T> T convertFromInternal(
      ConsumerRecord<?, ?> consumerRecord, Class<T> targetClass, Object conversionHint) {
    T result;
    try {
      byte[] payload = (byte[]) consumerRecord.value();

      // pull the record headers into a map so the contentType header can be read
      Map<String, String> headers = new HashMap<>();
      consumerRecord.headers().forEach(header ->
          headers.put(header.key(), asString(header.value())));

      MimeType mimeType = messageMimeType(conversionHint, headers);
      if (mimeType == null) {
        return null;
      }

      // writer schema is resolved from the mime type, reader schema from the target class
      Schema writerSchema = resolveWriterSchemaForDeserialization(mimeType);
      Schema readerSchema = resolveReaderSchemaForDeserialization(targetClass);
      @SuppressWarnings("unchecked")
      DatumReader<Object> reader = getDatumReader((Class<Object>) targetClass, readerSchema, writerSchema);
      Decoder decoder = DecoderFactory.get().binaryDecoder(payload, null);
      result = (T) reader.read(null, decoder);
    } catch (IOException e) {
      throw new RuntimeException("Failed to read payload", e);
    }
    return result;
  }

  private MimeType messageMimeType(Object conversionHint, Map<String, String> headers) {
    MimeType mimeType;
    try {
      String contentType = headers.get(MessageHeaders.CONTENT_TYPE);
      log.debug("contentType: {}", contentType);
      mimeType = MimeType.valueOf(contentType);
    } catch (InvalidMimeTypeException e) {
      log.error("Exception getting object MimeType from contentType header", e);
      if (conversionHint instanceof MimeType) {
        mimeType = (MimeType) conversionHint;
      } else {
        return null;
      }
    }
    return mimeType;
  }

  private String asString(byte[] byteArray) {
    String theString = new String(byteArray, Charset.defaultCharset());
    return theString.replace("\"", "");
  }
}

The revised TopicListener:
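
(What follows is a minimal sketch rather than the exact code, assuming the converter is injected as a bean and application/*+avro is passed as the fallback conversion hint; the full version is in the repo linked below.)

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;
import org.springframework.util.MimeType;

import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;

@Slf4j
@Component
@RequiredArgsConstructor
public class TopicListener {

  private final AvroKafkaMessageConverter messageConverter;

  @KafkaListener(topics = {"test-request"})
  public void listenForMessage(ConsumerRecord<?, ?> consumerRecord) {
    // the MimeType is only a fallback conversionHint; normally the
    // contentType header written by spring-cloud-stream is used
    Request request = messageConverter.convertFromInternal(
        consumerRecord, Request.class, MimeType.valueOf("application/*+avro"));
    log.info("listenForMessage. got a message: {}", request.getMessage());
  }
}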

This solution only consumes a single message at a time, but it can easily be modified to consume batches of messages.

The full solution is here: https://github.com/robjwilkins/avro-example/tree/develop

Answer 1 (score: 0):

You should define the deserializer explicitly by creating a DefaultKafkaConsumerFactory and your TopicListener bean in a configuration class, like this:

@Configuration
@EnableKafka
public class TopicListenerConfig {

  @Value("${spring.kafka.bootstrap-servers}")
  private String bootstrapServers;

  @Value("${spring.kafka.consumer.group-id}")
  private String groupId;

  @Bean
  public Map<String, Object> consumerConfigs() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
    props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
    props.put(JsonDeserializer.TRUSTED_PACKAGES, "com.wilkins.avro.consumer");
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    return props;
  }

  @Bean
  public ConsumerFactory<String, String> consumerFactory() {
    return new DefaultKafkaConsumerFactory<>(consumerConfigs());
  }

  @Bean
  public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
        new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    return factory;
  }

  @Bean
  public TopicListener topicListener() {
    return new TopicListener();
  }
}

Answer 2 (score: 0):

You can configure the binding to use the Kafka serializer natively.

Set the producer property useNativeEncoding to true and configure the serializer using the ...producer.configuration Kafka properties. Serialization is then handled by the configured Kafka serializer instead of a spring-cloud-stream MessageConverter.

EDIT

Example:

spring:
  cloud:
    stream:
      # Generic binding properties
      bindings:
        input:
          consumer:
            use-native-decoding: true
          destination: so54448732
          group: so54448732
        output:
          destination: so54448732
          producer:
            use-native-encoding: true
      # Kafka-specific binding properties
      kafka:
        bindings:
          input:
            consumer:
              configuration:
                value.deserializer: com.example.FooDeserializer
          output:
            producer:
              configuration:
                value.serializer: com.example.FooSerializer

Answer 3 (score: 0):

Thanks, use-native-encoding saved my time:

spring:
  cloud:
    stream:
      # Generic binding properties
      bindings:
        input:
          consumer:
            use-native-decoding: true
          destination: so54448732
          group: so54448732
        output:
          destination: so54448732
          producer:
            use-native-encoding: true
      # Kafka-specific binding properties
      kafka:
        bindings:
          input:
            consumer:
              configuration:
                value.deserializer: com.example.FooDeserializer
          output:
            producer:
              configuration:
                value.serializer: com.example.FooSerializer