如何在Spark Streaming中使用`ConsumerRebalanceListener`?

时间:2019-02-06 15:59:11

标签: scala apache-kafka spark-streaming

我正在使用spark-streaming-kafka-0-10_2.11spark-streaming_2.1。我使用以下代码来消耗卡夫卡:

import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010._
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "localhost:9092,anotherhost:9092",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "use_a_separate_group_id_for_each_stream",
  "auto.offset.reset" -> "latest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

val topics = Array("topicA", "topicB")
val stream = KafkaUtils.createDirectStream[String, String](
  streamingContext,
  PreferConsistent,
  Subscribe[String, String](topics, kafkaParams)
)

stream.foreachRDD { rdd =>
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

  // some time later, after outputs have completed
  stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
}

有时存在以下例外情况:

org.apache.kafka.clients.consumer.commitfailedexception: commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. this means that the time between subsequent calls to poll() was longer than the configured session.timeout.ms, which typically implies that the poll loop is spending too much time message processing. you can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.

因此,我想使用ConsumerRebalanceListener保存TopicPartition信息。我知道我们可以这样使用它:

    val consumer = new KafkaConsumer[String,String](props)

    val listener = new RebalanceListener

    consumer.subscribe(Collections.singletonList(topic), listener)

所以,我的问题是如何在DirectStream中使用侦听器?

0 个答案:

没有答案