I am using Kafka with a Spring Boot Kafka stream listener in my application. The configuration is: 160 partitions, with HPA set to 6-40, so under peak load the pods scale out to 40. The problem is that a single message at a single offset is consumed multiple times, sometimes as many as 15 times, which creates a bottleneck on the DB and occasionally causes other issues.
log.info("Acknowledging message offset : {} for correlation ID {} time taken {} groupId {}", offset, message.getPayload().getCorrelationId(), (System.currentTimeMillis() - startTime), groupId);
Multiple pods committed the same correlation ID and the same offset. I also logged the group ID to confirm that all pods are in the same consumer group; some of the logs are below:
03-10-2020 15:39:23.966 [KafkaConsumerDestination{consumerDestinationName='topic-input', partitions=0, dlqName='null'}.container-4-C-1] INFO c.c.g.m.m.c.MessageConsumerImpl.processCreateWorkflow - Acknowledging message offset : 118347 for correlation ID f5979128-f69c-4b32-b510-6972ef1cadff time taken 2994 groupId topic
==========
03-10-2020 15:39:23.947 [KafkaConsumerDestination{consumerDestinationName='topic-input', partitions=0, dlqName='null'}.container-5-C-1] INFO c.c.g.m.m.c.MessageConsumerImpl.processCreateWorkflow - Acknowledging message offset : 119696 for correlation ID f5979128-f69c-4b32-b510-6972ef1cadff time taken 1434 groupId topic
The pod IDs for these two logs are 5f812fe2-38d4-430e-a395-943d77747b3a and 194786f6-83df-4288-959c-7fcfb42de45a (they are different pods).
Another odd thing is that they are all consuming messages from the same partition, partition 0. We tried looking for messages from the other partitions, but no logs for any other partition were ever printed. Below is the Spring Boot Kafka stream listener configuration:
stream:
  kafka:
    binder:
      brokers: brkr:9092
      zkNodes: zk1:2181,zk2:2181,zk3:2181
      autoCreateTopics: false
      # retry configs
      producerProperties:
        key.serializer: org.apache.kafka.common.serialization.StringSerializer
        value.serializer: org.apache.kafka.common.serialization.ByteArraySerializer
        retries: 3
        max.in.flight.requests.per.connection: 1
        retry.backoff.ms: 9000
        request.timeout.ms: 400000
        delivery.timeout.ms: 450000
      applicationId: ng-member-service
      consumerProperties:
        key.deserializer: org.apache.kafka.common.serialization.StringDeserializer
        value.deserializer: org.apache.kafka.common.serialization.ByteArrayDeserializer
        #max.poll.records: 100
        auto.offset.reset: earliest
        session.timeout.ms: 300000
        request.timeout.ms: 400000
        allow.auto.create.topics: false
        heartbeat.interval.ms: 80000
    default:
      consumer:
        autoCommitOffset: false
      producer:
        messageKeyExpression: payload.entityId
    bindings:
      topic-input:
        consumer:
          configuration:
            max.poll.records: 350
  default:
    consumer:
      partitioned: true
      concurrency: 4
    producer:
      partitionKeyExpression: payload.entityId
      #partitionCount: 4
  bindings:
    topic-input:
      destination: topic-input
      group: topic
      consumer:
        maxAttempts: 1
        partitioned: true
        concurrency: 12
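One more detail on the partition-0 observation, in case it matters: my understanding (an assumption on my part, not something I have verified against the binder source) is that when `partitionKeyExpression` is set, the binder picks the target partition roughly as the key's hash modulo the configured `partitionCount`. The `selectPartition` helper below is purely illustrative, not binder API, but it shows why a `partitionCount` of 1 (the commented-out `partitionCount: 4` would leave it at its default) would send every record to partition 0:

```java
// Illustrative sketch (assumption): partition selection with a
// partitionKeyExpression is roughly hashCode(key) % partitionCount.
// selectPartition is a hypothetical helper, not Spring Cloud Stream API.
public class PartitionSketch {

    static int selectPartition(Object key, int partitionCount) {
        int hash = key.hashCode();
        // Guard against Integer.MIN_VALUE, whose absolute value overflows.
        return (hash == Integer.MIN_VALUE ? 0 : Math.abs(hash)) % partitionCount;
    }

    public static void main(String[] args) {
        String entityId = "f5979128-f69c-4b32-b510-6972ef1cadff"; // sample key
        // With partitionCount at a default of 1, every key maps to 0.
        System.out.println(selectPartition(entityId, 1));
        // With partitionCount matching the topic's 160 partitions,
        // keys would spread across partitions.
        System.out.println(selectPartition(entityId, 160));
    }
}
```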
Any help with the above would be greatly appreciated.