Spring Cloud Stream Kafka - fetching batches and missing heartbeats

Date: 2019-02-28 18:55:54

Tags: apache-kafka spring-cloud spring-cloud-stream

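I'm looking at a Spring Boot service that reads messages from Apache Kafka, requests the records indicated by each message from another service via HTTP, processes them, saves some data to a database, and publishes the result to another topic.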

This is done via

@StreamListener(Some.INPUT)
@SendTo(Some.OUTPUT)

This is done in several services and generally works well. The only property set is

spring.cloud.stream.binder.consumer.concurrency=20

The topic itself has 20 partitions, which should match.

When monitoring the reads from Kafka, we found very low throughput and strange behaviour:

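The app reads up to 500 messages at once, followed by 1-2 minutes of nothing. During that time the consumer repeatedly logs that it is "missing heartbeats because the partition has been rebalanced" and "re-assigning partitions", and sometimes even throws an exception saying it "failed to commit because the poll interval has elapsed".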

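We concluded that the consumer fetches 500 messages, takes too long to process them all, and misses its time window, so it cannot commit any of the 500 messages to the broker. The broker then reassigns the partitions and re-sends the same messages all over again.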
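
The conclusion above can be checked with simple arithmetic. The following is only a sketch: it assumes the Kafka consumer defaults (max.poll.records = 500, max.poll.interval.ms = 300000) and a hypothetical average processing time of one second per record. The class and method names are made up for illustration and are not part of any library.

```java
public class PollBudget {

    // Kafka consumer defaults; the per-record time used in main() is an assumed figure.
    public static final int DEFAULT_MAX_POLL_RECORDS = 500;
    public static final long DEFAULT_MAX_POLL_INTERVAL_MS = 300_000L;

    /** True if a full batch can be processed before the poll interval runs out. */
    public static boolean fitsInPollInterval(int maxPollRecords, long perRecordMs, long maxPollIntervalMs) {
        return (long) maxPollRecords * perRecordMs < maxPollIntervalMs;
    }

    /** Largest batch size whose total processing time stays strictly below the poll interval. */
    public static long largestSafeBatch(long perRecordMs, long maxPollIntervalMs) {
        if (perRecordMs <= 0) {
            throw new IllegalArgumentException("perRecordMs must be positive");
        }
        return (maxPollIntervalMs - 1) / perRecordMs;
    }

    public static void main(String[] args) {
        long perRecordMs = 1_000L; // assumption: HTTP call plus database write per record

        System.out.println("default batch fits: "
                + fitsInPollInterval(DEFAULT_MAX_POLL_RECORDS, perRecordMs, DEFAULT_MAX_POLL_INTERVAL_MS));
        System.out.println("largest safe batch: "
                + largestSafeBatch(perRecordMs, DEFAULT_MAX_POLL_INTERVAL_MS));
    }
}
```

Under these assumed numbers a batch of 500 does not fit into the interval, which would match the observed rebalancing. Either a smaller max.poll.records or a larger max.poll.interval.ms restores the balance.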

After going through threads and docs, I found the "max.poll.records" property, but the advice on where to set it is conflicting.

Some say to set it under

spring.cloud.stream.bindings.consumer.<input>.configuration

others say

spring.cloud.stream.kafka.binders.consumer-properties

I tried setting both to 1, but the service's behaviour did not change.

How do I correctly handle the case where the consumer cannot keep up with the required poll interval under the default settings?

common-yaml:

spring.cloud.stream.default.group=${spring.application.name}

service-yaml

spring:
  cloud:
    stream:
      default:
        consumer.headerMode: embeddedHeaders
        producer.headerMode: embeddedHeaders
      bindings:
        someOutput:
          destination: outTopic
        someInput:
          destination: inTopic
          consumer:
            concurrency: 30
      kafka:
        bindings:
          consumer:
            someInput:
              configuarion:
                max.poll.records: 20 # ConsumerConfig ignores this
              consumer:
                enableDlq: true
                configuarion:
                  max.poll.records: 30 # ConsumerConfig ignores this
          someInput:
            configuarion:
              max.poll.records: 20 # ConsumerConfig ignores this
            consumer:
              enableDlq: true
              configuarion:
                max.poll.records: 30 # ConsumerConfig ignores this
        binder:
          consumer-properties:
            max.poll.records: 10 # this gets used first
          configuration:
            max.poll.records: 40 # this gets used when the first one is not present

"Ignores this" always means that, if no other property is set, the consumer configuration keeps max.poll.records at its default of 500.

Edit: we have come closer:

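The problem was related to Spring Retry being configured with an exponential back-off strategy (exponentialBackoffStrategy), combined with a large number of errors, which effectively stalled the application.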
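
To illustrate why retries with exponential back-off can stall a consumer, here is a rough sketch that adds up the pauses for a single failing record. It assumes the retries run on the consumer thread, so the time counts against max.poll.interval.ms. The parameter values (3 attempts, 1000 ms initial interval, multiplier 2.0, 10000 ms cap) are illustrative assumptions, not the retry configuration from this question, and the class is hypothetical.

```java
public class RetryBlockingTime {

    /**
     * Total back-off time for one record that fails on every attempt:
     * one pause between each pair of consecutive attempts, each capped at maxMs.
     */
    public static long backOffPerRecordMs(int maxAttempts, long initialMs, double multiplier, long maxMs) {
        long total = 0;
        double interval = initialMs;
        for (int pause = 1; pause < maxAttempts; pause++) {
            total += Math.min((long) interval, maxMs);
            interval *= multiplier;
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed values for illustration only.
        long perRecord = backOffPerRecordMs(3, 1_000L, 2.0, 10_000L);
        int failingRecords = 200;

        System.out.println(perRecord + " ms of back-off per failing record");
        System.out.println(failingRecords * perRecord + " ms for " + failingRecords
                + " failing records handled one after another on a single thread");
    }
}
```

With these assumed values, 200 malformed messages processed sequentially spend about ten minutes in back-off alone, which is longer than the default poll interval of five minutes. A more aggressive back-off configuration makes this worse.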

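What I don't understand is this: we forced 200 errors by publishing malformed messages to the topic in question. The app then read all 200, took a very long time (with the old retry configuration), and committed all 200 errors at once.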

How does this make sense if we have

max.poll.records: 1
concurrency: 1
ackEachRecord = true
enableDlq: true # (which implicitly makes autoCommitOffsets = true according to the docs)

1 answer:

Answer 0 (score: 0)

The property is spring.cloud.stream.kafka.bindings.<input>.consumer.configuration.max.poll.records.

See the documentation ...

  

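Kafka Consumer Properties

The following properties are available for Kafka consumers only and must be prefixed with spring.cloud.stream.kafka.bindings.<channelName>.consumer.

...

configuration

Map with a key/value pair containing generic Kafka consumer properties.

Default: Empty map.

...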

You can also increase max.poll.interval.ms.

EDIT

I just tested with 2.1.0.RELEASE and it works as I described:

No settings

2019-03-01 08:47:59.560  INFO 44698 --- [           main] o.a.k.clients.consumer.ConsumerConfig    : ConsumerConfig values: 
    ...
    max.poll.records = 500
    ...

Boot default

spring.kafka.consumer.properties.max.poll.records=42

2019-03-01 08:49:49.197  INFO 45044 --- [           main] o.a.k.clients.consumer.ConsumerConfig    : ConsumerConfig values: 
    ...
    max.poll.records = 42
    ...

Binder default #1

spring.kafka.consumer.properties.max.poll.records=42
spring.cloud.stream.kafka.binder.consumer-properties.max.poll.records=43

2019-03-01 08:52:11.469  INFO 45842 --- [           main] o.a.k.clients.consumer.ConsumerConfig    : ConsumerConfig values: 
    ...
    max.poll.records = 43
    ...

Binder default #2

spring.kafka.consumer.properties.max.poll.records=42
spring.cloud.stream.kafka.binder.configuration.max.poll.records=43

2019-03-01 08:54:06.211  INFO 46252 --- [           main] o.a.k.clients.consumer.ConsumerConfig    : ConsumerConfig values: 
    ...
    max.poll.records = 43
    ...

Binding default

spring.kafka.consumer.properties.max.poll.records=42
spring.cloud.stream.kafka.binder.configuration.max.poll.records=43
spring.cloud.stream.kafka.default.consumer.configuration.max.poll.records=44

2019-03-01 09:02:26.004  INFO 47833 --- [           main] o.a.k.clients.consumer.ConsumerConfig    : ConsumerConfig values: 
    ...
    max.poll.records = 44
    ...

Binding-specific

spring.kafka.consumer.properties.max.poll.records=42
spring.cloud.stream.kafka.binder.configuration.max.poll.records=43
spring.cloud.stream.kafka.default.consumer.configuration.max.poll.records=44
spring.cloud.stream.kafka.bindings.input.consumer.configuration.max.poll.records=45

2019-03-01 09:05:01.452  INFO 48330 --- [           main] o.a.k.clients.consumer.ConsumerConfig    : ConsumerConfig values: 
    ...
    max.poll.records = 45
    ...
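
The override order shown by the log output above can be modelled as a simple layered lookup. This is only a sketch of the observed behaviour, not the binder's actual implementation. The class is hypothetical, the binding name "input" is taken from the test properties, and spring.cloud.stream.kafka.binder.consumer-properties is left out because the runs above only show it overriding the Boot property.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MaxPollRecordsPrecedence {

    // Property prefixes in ascending priority, as observed in the test runs.
    public static final List<String> PREFIXES = Collections.unmodifiableList(Arrays.asList(
            "spring.kafka.consumer.properties.",
            "spring.cloud.stream.kafka.binder.configuration.",
            "spring.cloud.stream.kafka.default.consumer.configuration.",
            "spring.cloud.stream.kafka.bindings.input.consumer.configuration."));

    /** Returns the value from the highest-priority prefix that defines the key, else the Kafka default. */
    public static String resolve(Map<String, String> props, String key, String kafkaDefault) {
        String value = kafkaDefault;
        for (String prefix : PREFIXES) {
            String candidate = props.get(prefix + key);
            if (candidate != null) {
                value = candidate; // a later (more specific) level overrides earlier ones
            }
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("spring.kafka.consumer.properties.max.poll.records", "42");
        props.put("spring.cloud.stream.kafka.binder.configuration.max.poll.records", "43");
        props.put("spring.cloud.stream.kafka.default.consumer.configuration.max.poll.records", "44");
        props.put("spring.cloud.stream.kafka.bindings.input.consumer.configuration.max.poll.records", "45");

        System.out.println("max.poll.records = " + resolve(props, "max.poll.records", "500"));
    }
}
```

Removing entries from the map one level at a time reproduces the values 44, 43, 42, and finally the Kafka default of 500 seen in the runs above.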

EDIT2

Here is the complete test app. I just created a new app at http://start.spring.io and selected "Kafka" and "Cloud Stream".

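import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.messaging.Sink;

@SpringBootApplication
@EnableBinding(Sink.class)
public class So54932453Application {

    public static void main(String[] args) {
        // Start the context (which logs the ConsumerConfig values), then shut down again.
        SpringApplication.run(So54932453Application.class, args).close();
    }

    @StreamListener(Sink.INPUT)
    public void listen(String in) {
        // Intentionally empty: only the logged consumer configuration matters here.
    }

}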

spring.cloud.stream.bindings.input.group=so54932453

spring.kafka.consumer.properties.max.poll.records=42
spring.cloud.stream.kafka.binder.configuration.max.poll.records=43
spring.cloud.stream.kafka.default.consumer.configuration.max.poll.records=44
spring.cloud.stream.kafka.bindings.input.consumer.configuration.max.poll.records=45

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.1.3.RELEASE</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>net.gprussell</groupId>
    <artifactId>so54932453</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>so54932453</name>
    <description>Demo</description>

    <properties>
        <java.version>1.8</java.version>
        <spring-cloud.version>Greenwich.RELEASE</spring-cloud.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-stream</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-stream-binder-kafka</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.kafka</groupId>
            <artifactId>spring-kafka</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-stream-test-support</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.kafka</groupId>
            <artifactId>spring-kafka-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.cloud</groupId>
                <artifactId>spring-cloud-dependencies</artifactId>
                <version>${spring-cloud.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

    <repositories>
        <repository>
            <id>spring-milestones</id>
            <name>Spring Milestones</name>
            <url>https://repo.spring.io/milestone</url>
        </repository>
    </repositories>

</project>