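I am using flink-connector-kafka 0.11 with Flink 1.7.
Flink checkpointing is explicitly disabled, and I rely on Kafka auto-committing offsets every 5 seconds. From time to time I see this error: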
2019-07-16 08:32:04.273 [JobName] ERROR o.a.kafka.clients.consumer.internals.ConsumerCoordinator - [ConsumerName] Offset commit failed on partition topic-name-0 at offset 7591394545: The request timed out.
2019-07-16 08:32:04.310 [JobName] INFO o.a.kafka.clients.consumer.internals.AbstractCoordinator - [ConsumerName] Marking the coordinator servername:21000 (id: 2147482313 rack: null) dead
2019-07-16 08:32:04.322 [JobName] WARN o.a.kafka.clients.consumer.internals.ConsumerCoordinator - [ConsumerName] Asynchronous auto-commit of offsets {topic-name-0=OffsetAndMetadata{offset=7591394751, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing the latest consumed offsets.
2019-07-16 08:32:04.337 [JobName] WARN o.a.kafka.clients.consumer.internals.ConsumerCoordinator - [ConsumerName] Asynchronous auto-commit of offsets {topic-name-0=OffsetAndMetadata{offset=7591394545, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing the latest consumed offsets.
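When this happens, I check the Kafka offsets for that group/topic/partition and can see that they are no longer being auto-committed. If I then have to restart the job, it replays all data since the incident.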
Is there a way to make Flink retry the offset commit? I would increase the request.timeout.ms parameter, but it is already set to 305'000 ms.
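For reference, the consumer settings described above can be sketched roughly as follows. This is a minimal sketch under assumptions, not the job's actual code: the class name `KafkaAutoCommitConfig`, the `build` helper, and the group id are hypothetical, and only the auto-commit interval and request timeout come from the description above.

```java
import java.util.Properties;

public class KafkaAutoCommitConfig {

    // Builds consumer properties matching the setup described in the question.
    // bootstrapServers and groupId are placeholders supplied by the caller,
    // not values taken from the original job.
    static Properties build(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // With Flink checkpointing disabled, offset commits are left to the
        // Kafka client's periodic auto-commit.
        props.setProperty("enable.auto.commit", "true");
        props.setProperty("auto.commit.interval.ms", "5000");
        // Already raised to 305'000 ms, as stated in the question.
        props.setProperty("request.timeout.ms", "305000");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical values for illustration only.
        Properties props = build("servername:21000", "my-consumer-group");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In a job like this one, the resulting `Properties` object would typically be passed to the `FlinkKafkaConsumer011` constructor together with the topic name and a deserialization schema.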