Reliable asynchronous batch Kafka producer with Spring Cloud Stream

Time: 2020-03-20 11:36:17

Tags: java spring-cloud-stream spring-cloud-stream-binder-kafka

I want to set up a transformation with Spring Cloud Stream (consume from one topic, produce to another). I also want it to be:

  • Reliable: let's say "at least once"
  • Fast at producing: one producer.flush() per batch instead of per message
  • Fast at committing offsets: one commit per batch per partition

With the raw Kafka client, or with spring-kafka, I would do the following (assuming the producer is configured properly: acks=all, linger.ms > 0, a large enough batch size, and so on):

  1. Get the consumed messages (e.g. from Consumer.poll())
  2. Apply my custom transformation to each message
  3. For each message: producer.send(message).addCallback(...)
  4. producer.flush()
  5. Check that no errors were reported in the send callbacks
  6. consumer.commit() at the highest offset
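With names simplified and the Kafka client stubbed out by an executor (so the sketch stays self-contained; real code would call producer.send(...), producer.flush() and consumer.commitSync() instead), the per-batch flow above can be sketched as:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

public class BatchFlowSketch {

    // Returns the offset to commit (here: the batch size) on success, or -1 if any send failed.
    public static int processBatch(List<String> batch) {
        ExecutorService sender = Executors.newFixedThreadPool(2); // stand-in for the async producer
        AtomicBoolean sendFailed = new AtomicBoolean(false);
        try {
            // Steps 2-3: transform each message and send it asynchronously;
            // the callback records any error
            CompletableFuture<?>[] inFlight = batch.stream()
                    .map(msg -> CompletableFuture
                            .runAsync(() -> send(transform(msg)), sender)
                            .exceptionally(ex -> { sendFailed.set(true); return null; }))
                    .toArray(CompletableFuture[]::new);

            // Steps 4-5: "flush" by waiting for every in-flight send,
            // then check that no callback saw an error
            CompletableFuture.allOf(inFlight).join();

            // Step 6: commit the highest offset only if the whole batch went out
            return sendFailed.get() ? -1 : batch.size();
        } finally {
            sender.shutdown();
        }
    }

    private static String transform(String msg) { return msg.toUpperCase(); } // custom transformation

    private static void send(String msg) { /* producer.send(...) in real code */ }

    public static void main(String[] args) {
        System.out.println("commit offset: " + processBatch(List.of("a", "b", "c")));
    }
}
```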

The question is: how do I achieve something similar with spring-cloud-stream? I know about consumer.batch-mode = true and producer.sync = false, but I am not sure what the correct way is to tie reliable production together with offset commits.

Update: I came up with a simple solution (note that in addition to the initial requirements, I also needed dynamic routing with dynamic topic creation):

Configuration

import java.util.List;
import java.util.Objects;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cloud.stream.binding.BinderAwareChannelResolver;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.support.Acknowledgment;
import org.springframework.kafka.support.KafkaHeaders;
import org.springframework.kafka.support.ProducerListener;
import org.springframework.messaging.Message;
import org.springframework.messaging.support.MessageBuilder;

@Configuration
public class DynamicRouterCfg {

    @Autowired
    private BinderAwareChannelResolver outputResolver;

    @Autowired
    private DynamicRouterProducerListener dynamicRouterProducerListener;

    private final Lock lock = new ReentrantLock();

    @Bean
    public Consumer<Message<List<byte[]>>> dynamicRouter() {
        return msgs -> {
            lock.lock();
            try {
                dynamicRouterProducerListener.setBatchSize(msgs.getPayload().size());

                for (byte[] payload : msgs.getPayload()) {
                    doBusinessLogic(payload);
                    outputResolver.resolveDestination(resolveDestination(payload))
                            .send(MessageBuilder.withPayload(payload)
                            .build());
                }

                if (dynamicRouterProducerListener.waitForBatchSuccess()) {
                    Acknowledgment ack = (Acknowledgment) msgs.getHeaders().get(KafkaHeaders.ACKNOWLEDGMENT);
                    Objects.requireNonNull(ack).acknowledge();
                }

            } finally {
                lock.unlock();
            }
        };
    }

    @Bean
    public DynamicRouterProducerListener dynamicRouterProducerListener() {
        // expose the listener as a bean so it can be autowired above and
        // picked up by the Kafka binder's producer
        return new DynamicRouterProducerListener();
    }

    private void doBusinessLogic(byte[] payload) {
        // placeholder for business transformation
    }

    private String resolveDestination(byte[] payload) {
        // some business logic for destination resolving
        return null;
    }

    private static class DynamicRouterProducerListener implements ProducerListener<Object, Object> {

        private volatile CountDownLatch batchLatch;

        public void setBatchSize(int batchSize) {
            this.batchLatch = new CountDownLatch(batchSize);
        }

        public boolean waitForBatchSuccess() {
            try {
                return batchLatch.await(10000, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the interrupt flag
                return false;
            }
        }

        @Override
        public void onSuccess(ProducerRecord<Object, Object> producerRecord, RecordMetadata recordMetadata) {
            batchLatch.countDown();
        }

    }
}
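A note on the latch pattern: the listener above only counts down in onSuccess, so a single failed send makes waitForBatchSuccess block for the full 10-second timeout. A hardened variant (a plain-JDK sketch, not from the original post) counts down on failure as well and records it, so the wait ends as soon as the whole batch is resolved:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Same idea as DynamicRouterProducerListener, but every callback (success or
// error) releases the latch, and a flag remembers whether anything failed.
public class BatchCompletionTracker {

    private volatile CountDownLatch batchLatch;
    private final AtomicBoolean failed = new AtomicBoolean(false);

    public void startBatch(int batchSize) {
        failed.set(false);
        batchLatch = new CountDownLatch(batchSize);
    }

    public void onSuccess() {       // wire this to ProducerListener.onSuccess
        batchLatch.countDown();
    }

    public void onError() {         // wire this to ProducerListener.onError
        failed.set(true);
        batchLatch.countDown();     // still release the latch: the send is resolved
    }

    // True only if every send completed in time and none of them failed.
    public boolean awaitBatchSuccess(long timeoutMs) {
        try {
            return batchLatch.await(timeoutMs, TimeUnit.MILLISECONDS) && !failed.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With this shape, an error no longer costs the full timeout: awaitBatchSuccess returns as soon as the last callback fires, and reports false.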

application.yml

spring:
  cloud:
    function:
      definition: dynamicRouter
    stream:
      bindings:
        dynamicRouter-in-0:
          destination: consumed.topic
          group: test.group
          consumer:
            batch-mode: true
            concurrency: 1
            header-mode: none
            use-native-decoding: true
      kafka:
        binder:
          brokers: broker:9092
          auto-create-topics: true
          required-acks: all
        bindings:
          dynamicRouter-in-0:
            consumer:
              auto-rebalance-enabled: false
              auto-commit-offset: false
              auto-commit-on-error: false
              configuration:
                fetch.max.bytes: 5024000
                isolation.level: read_committed
        default:
          producer:
            sync: false
            batch-timeout: 10 # as I understand it, this maps to linger.ms
            compression: gzip
            configuration:
              max.in.flight.requests.per.connection: 1
              max.request.size: 5024000
            # I need to create topics with a custom configuration
            topic.replication-factor: 2
            topic.properties:
              cleanup.policy: compact
              min.cleanable.dirty.ratio: 0.1

Drawbacks I can see here:

  1. It uses the deprecated BinderAwareChannelResolver: I don't know how I could use spring.cloud.stream.sendto.destination here instead
  2. After each batch it waits an extra ${batch-timeout}
  3. I am not sure about the guarantees of producer.onSuccess (e.g. whether "duplicate" success calls are possible), especially when messages from a single incoming batch are routed to different topics
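On drawback 1: in Spring Cloud Stream 3.x, the non-deprecated alternative to BinderAwareChannelResolver is the `spring.cloud.stream.sendto.destination` message header, where the function returns the outgoing message instead of sending it itself and the binder routes it to the topic named in the header. A rough sketch for the single-message case (method names taken from the snippet above; I have not verified how this combines with batch-mode consumption, which is exactly the open part of the question):

```java
// Requires Spring Cloud Stream 3.x functional bindings.
@Bean
public Function<Message<byte[]>, Message<byte[]>> dynamicRouter() {
    return msg -> {
        byte[] payload = msg.getPayload();
        doBusinessLogic(payload);
        return MessageBuilder.withPayload(payload)
                // the binder routes the outgoing message to the topic named here
                .setHeader("spring.cloud.stream.sendto.destination",
                           resolveDestination(payload))
                .build();
    };
}
```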

0 Answers:

No answers