I was wondering whether anyone has run into this (it may be something very simple). I have a setup where each server is configured so that (a minimal sketch of this setup follows the list):
1) There is exactly 1 consumer per consumer group at any time (e.g. consumer group "cg1" has only 1 consumer).
2) One consumer group can subscribe to multiple topics.
3) If a server goes down, then after a restart the consumer group rejoins and subscribes to the topic(s) again.
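Roughly, one server instance looks like this (a sketch only, not my actual code; the topic names topicA/topicB are placeholders, the rest mirrors the consumer properties further down):

import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SingleConsumerPerGroup {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092,localhost:9093,localhost:9094");
        props.put("group.id", "cg1");             // exactly one consumer in this group
        props.put("enable.auto.commit", "false"); // offsets are committed manually
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        // The single consumer of group "cg1" subscribes to more than one topic.
        KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList("topicA", "topicB")); // placeholder topic names

        // ... poll / process / commitSync / close, as in the process() method below ...
        consumer.close();
    }
}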
Currently the way I consume messages is (not posting the whole code):
1) I instantiate a new consumer (as part of a TimerTask that runs every 30 seconds; the scheduling is sketched right after this list).
2) I load all the consumer properties from a Properties file.
3) I specify the same group.id every time.
4) I consume the messages.
5) I commit offsets manually with .commitSync() (enable.auto.commit=false).
6) I close the consumer.
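The 30-second scheduling is a plain java.util.Timer; roughly like this (a sketch, not the actual code; the class name, timer name and the body of run() are placeholders - the real task loads the Properties file, builds the consumer and calls the process() method shown below):

import java.util.Timer;
import java.util.TimerTask;

public class ConsumerScheduler {
    public static void main(String[] args) {
        Timer timer = new Timer("kafka-consumer-timer"); // non-daemon, keeps the JVM alive

        // Every 30 seconds: build a fresh consumer, poll, commit, close (steps 1-6 above).
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override
            public void run() {
                // Placeholder body: the real task creates the consumer from the
                // Properties file and calls process() (shown further down).
                System.out.println("running consumer cycle");
            }
        }, 0L, 30_000L);
    }
}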
My topic has 6 partitions and a replication factor of 3. I close the consumer every time I am done committing offsets, because releasing any resources seems safer.
It seems that every time my server restarts, I consume the messages all over again. I have auto.offset.reset=earliest, which means that when no valid offset can be retrieved for the consumer group, I will always start from the earliest available offset. But that should not affect any existing consumer group that has successfully committed offsets to the cluster. Even after rejoining, I should start receiving messages from the last committed offset.
Is there something about consumer groups or consumers that I am misunderstanding, which plays a key role together with the auto.offset.reset setting? Or do I really need to do something manual here? I assume that having more partitions than actual consumers per group is causing the problem, but I would be happy to learn more.
**Consumer code example**
public void process() {
    logger.info("beginning of process()");
    ConsumerRecords<byte[], byte[]> records = this.getConsumer().poll(KafkaConstants.KAFKA_POLL_TIME_MILLIS);
    if (records != null && records.count() > 0) {
        // Prescribed by Kafka API to have finer control over offsets
        for (TopicPartition partition : records.partitions()) {
            List<ConsumerRecord<byte[], byte[]>> partitionRecords = records.records(partition);
            for (ConsumerRecord<byte[], byte[]> record : partitionRecords) {
                try {
                    logger.info("beginning of processEachRecord()");
                    this.processEachRecord(record);
                } catch (Exception e) {
                    logger.info("Exception whilst processing messages");
                    logger.error(e);
                    logger.info("Closing consumer after exception in processing");
                    this.getConsumer().close();
                    return;
                }
                try {
                    long lastOffset = partitionRecords.get(partitionRecords.size() - 1).offset();
                    consumer.commitSync(Collections.singletonMap(partition, new OffsetAndMetadata(lastOffset + 1)));
                } catch (CommitFailedException cfe) {
                    logger.info("Commit Failed------");
                    logger.error(cfe);
                    logger.info("Closing consumer after failed commit");
                    this.getConsumer().close();
                    return;
                }
            }
        }
    }
    logger.info("Total records=" + records.count());
    logger.info("Closing consumer");
    this.getConsumer().close();
}
Consumer configuration:
key.deserializer=org.apache.kafka.common.serialization.ByteArrayDeserializer
value.deserializer=org.apache.kafka.common.serialization.ByteArrayDeserializer
group.id=samegroup1
enable.auto.commit=false
auto.offset.reset=earliest
fetch.min.bytes=10
bootstrap.servers=localhost:9092,localhost:9093,localhost:9094
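For completeness, roughly how the consumer gets built from such a file (again a sketch, not my actual code; the file name consumer.properties and the topic name are placeholders):

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerFactory {

    // Creates a new KafkaConsumer from the properties listed above.
    public static KafkaConsumer<byte[], byte[]> newConsumer() throws IOException {
        Properties props = new Properties();
        try (InputStream in = new FileInputStream("consumer.properties")) { // placeholder file name
            props.load(in);
        }
        KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("my-topic")); // placeholder topic name
        return consumer;
    }
}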
Thanks to Vahid H and Hans J for their comments - this SO answer also explains the problem I am probably running into. For reference, here is my broker configuration:
broker.id=2
listeners=PLAINTEXT://localhost:9094
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=999999999
log.dirs=/tmp/kafka-logs-3
num.partitions=1
num.recovery.threads.per.data.dir=1
log.retention.hours=2
log.retention.minutes=45
log.retention.bytes=20971520
log.segment.bytes=10485760
log.roll.hours=1
log.retention.check.interval.ms=300000
offsets.retention.minutes=20
offsets.retention.check.interval.ms=300000
log.cleanup.policy=compact,delete
zookeeper.connect=localhost:2181,localhost:2182,localhost:2183
zookeeper.connection.timeout.ms=30000
compression.type=gzip
delete.topic.enable=true
kafka.metrics.polling.interval.secs=5
kafka.metrics.reporters=kafka.metrics.KafkaCSVMetricsReporter
kafka.csv.metrics.dir=/tmp/kafka_metrics
kafka.csv.metrics.reporter.enabled=false
This matters somewhat for me, because with auto.offset.reset=latest together with log.retention.minutes and offsets.retention.minutes I should be able to prevent duplicates (an "exactly once"-style mechanism). However, KAFKA-1194 causes the following:
kafka.common.KafkaStorageException: Failed to change the log file suffix from to .deleted for log segment 2
at kafka.log.LogSegment.kafkaStorageException$1(LogSegment.scala:340)
at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:342)
at kafka.log.Log.kafka$log$Log$$asyncDeleteSegment(Log.scala:981)
at kafka.log.Log.kafka$log$Log$$deleteSegment(Log.scala:971)
at kafka.log.Log$$anonfun$deleteOldSegments$1.apply(Log.scala:673)
at kafka.log.Log$$anonfun$deleteOldSegments$1.apply(Log.scala:673)
at scala.collection.immutable.List.foreach(List.scala:318)
at kafka.log.Log.deleteOldSegments(Log.scala:673)
at kafka.log.Log.deleteRetenionMsBreachedSegments(Log.scala:703)
at kafka.log.Log.deleteOldSegments(Log.scala:697)
at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:474)
at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:472)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
at kafka.log.LogManager.cleanupLogs(LogManager.scala:472)
at kafka.log.LogManager$$anonfun$startup$1.apply$mcV$sp(LogManager.scala:200)
at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.file.FileSystemException: \kafka2\40-0\00000000000000000002.log -> \kafka2\40-0\00000000000000000002.log.deleted: The process cannot access the file because it is being used by another process.
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:387)
at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287)
at java.nio.file.Files.move(Files.java:1395)
at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:711)
at org.apache.kafka.common.record.FileRecords.renameTo(FileRecords.java:210)
... 28 more
Suppressed: java.nio.file.FileSystemException: \kafka2\40-0\00000000000000000002.log -> \kafka2\40-0\00000000000000000002.log.deleted: The process cannot access the file because it is being used by another process.
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:301)
at
This is why I have to retain offsets and topics for a long time. If anyone has a better suggestion on how to maintain non-duplicate message consumption without relying on log cleanup, that would be very helpful.