Some Kafka parameters are fixed to "none" in spark-streaming-kafka-0-10_2.10

Date: 2017-10-10 12:08:36

Tags: spark-streaming spark-streaming-kafka

I am running a Spark Streaming job with spark-streaming-kafka-0-10_2.10, version 2.0.2. I am getting warnings like these:

```
17/10/10 16:42:25 WARN KafkaUtils: overriding enable.auto.commit to false for executor
17/10/10 16:42:25 WARN KafkaUtils: overriding auto.offset.reset to none for executor
17/10/10 16:42:25 WARN KafkaUtils: overriding executor group.id to spark-executor-dump_user_profile
17/10/10 16:42:25 WARN KafkaUtils: overriding receive.buffer.bytes to 65536 see KAFKA-3135
```

When I looked at the source code, I found the piece of code in KafkaUtils that overrides these params, a method named fixKafkaParams(...), shown below:

```scala
logWarning(s"overriding ${ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG} to false for executor")
kafkaParams.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false: java.lang.Boolean)

logWarning(s"overriding ${ConsumerConfig.AUTO_OFFSET_RESET_CONFIG} to none for executor")
kafkaParams.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "none")

// driver and executor should be in different consumer groups
val originalGroupId = kafkaParams.get(ConsumerConfig.GROUP_ID_CONFIG)
if (null == originalGroupId) {
  logError(s"${ConsumerConfig.GROUP_ID_CONFIG} is null, you should probably set it")
}
val groupId = "spark-executor-" + originalGroupId
logWarning(s"overriding executor ${ConsumerConfig.GROUP_ID_CONFIG} to ${groupId}")
kafkaParams.put(ConsumerConfig.GROUP_ID_CONFIG, groupId)

// possible workaround for KAFKA-3135
val rbb = kafkaParams.get(ConsumerConfig.RECEIVE_BUFFER_CONFIG)
if (null == rbb || rbb.asInstanceOf[java.lang.Integer] < 65536) {
  logWarning(s"overriding ${ConsumerConfig.RECEIVE_BUFFER_CONFIG} to 65536 see KAFKA-3135")
  kafkaParams.put(ConsumerConfig.RECEIVE_BUFFER_CONFIG, 65536: java.lang.Integer)
}

```

How can I get around this? Many thanks.

1 Answer:

Answer 0 (score: 0)

"KafkaUtils: overriding auto.offset.reset to none for executor" is normal behavior of KafkaUtils. It does not cause any problem and can safely be ignored.
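If the warnings clutter your logs, one option is to raise the log level for just this class in your log4j configuration. This is a sketch: the logger name assumes the kafka-0-10 package path shown in the linked source, and your log4j file location may differ.

```properties
# In log4j.properties: silence the "overriding ..." messages
# from KafkaUtils by raising its threshold from WARN to ERROR
log4j.logger.org.apache.spark.streaming.kafka010.KafkaUtils=ERROR
```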

The code in KafkaUtils is deliberately written this way: it adjusts the Kafka parameters for the executors to prevent problems on their side. If you check on the driver, you will see that the value of auto.offset.reset is not changed and stays as you defined it in your kafkaParams. Here is the link to KafkaUtils for reference:
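The reason the driver-side values survive is that the overrides are applied to the map handed to the executors, not to the one you configured. A minimal self-contained sketch of that copy-then-override pattern (plain Java HashMap, no Spark dependency; the object and method names here are illustrative, not Spark's API):

```scala
import java.util.{HashMap => JHashMap}

object FixParamsSketch {
  // Mirrors the override logic from fixKafkaParams, but applied
  // to a copy so the caller's (driver's) map is left untouched.
  def fixedCopy(driverParams: JHashMap[String, Object]): JHashMap[String, Object] = {
    val executorParams = new JHashMap[String, Object](driverParams) // copy first
    executorParams.put("enable.auto.commit", java.lang.Boolean.FALSE)
    executorParams.put("auto.offset.reset", "none")
    // driver and executors should be in different consumer groups
    executorParams.put("group.id", "spark-executor-" + driverParams.get("group.id"))
    executorParams
  }

  def main(args: Array[String]): Unit = {
    val driverParams = new JHashMap[String, Object]()
    driverParams.put("auto.offset.reset", "latest")
    driverParams.put("group.id", "dump_user_profile")

    val executorParams = fixedCopy(driverParams)

    // The driver keeps the value you configured...
    assert(driverParams.get("auto.offset.reset") == "latest")
    // ...while the executor copy is pinned to "none".
    assert(executorParams.get("auto.offset.reset") == "none")
    assert(executorParams.get("group.id") == "spark-executor-dump_user_profile")
    println("driver params unchanged; executor copy overridden")
  }
}
```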

https://github.com/apache/spark/blob/master/external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaUtils.scala

Initially I also thought this might be a problem, but after running my Kafka code I did not face any issues.