I am trying to consume data from Kafka using spark2-shell.
My code is below.
I start spark2-shell as follows:
spark2-shell --jars kafka-clients-0.10.1.2.6.2.0-205.jar, spark-sql-kafka-0-10_2.11-2.1.1.jar
Here is my code:
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming._
import org.apache.spark.streaming.dstream.InputDStream
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}
import spark.implicits._
val ssc = new StreamingContext(sc, Seconds(2))
val topics = List("testingtopic01")
val preferredHosts = LocationStrategies.PreferConsistent
val kafkaParams = Map(
  "bootstrap.servers" -> "localhost:9192",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "security.protocol" -> "SASL_PLAINTEXT",
  "auto.offset.reset" -> "earliest",
  "group.id" -> "spark-streaming-consumer-group"
)
val lines = KafkaUtils.createDirectStream[String, String](
  ssc,
  preferredHosts,
  ConsumerStrategies.Subscribe[String, String](topics.distinct, kafkaParams)
)
lines.print()
ssc.start()
However, after I start the stream, nothing is displayed:
scala> ssc.start()
18/12/19 15:50:07 WARN streaming.StreamingContext: Dynamic Allocation is enabled for this application. Enabling Dynamic allocation for Spark Streaming applications can cause data loss if Write Ahead Log is not enabled for non-replayable sources like Flume. See the programming guide for details on how to enable the Write Ahead Log.
Please suggest a way to work around this issue.
Thanks.
Answer 0 (score: 0)
You should set spark.streaming.dynamicAllocation.enabled = false. For more details, see Dynamic Allocation for Spark Streaming.
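As a rough sketch of how this could be applied: the setting has to be in place before the SparkContext is created, so in a shell session it would normally be passed with --conf at launch rather than changed afterwards. The snippet below is only an illustration under two assumptions of mine. First, that the warning in the question is tied to the core key spark.dynamicAllocation.enabled, which is why both keys are shown. Second, that the jar list stays the same as in the question, shown here as a placeholder. Inside the shell, the values the running context actually uses can then be checked:

```scala
// Launch the shell with dynamic allocation turned off (assumed flags, adjust to your cluster):
//   spark2-shell --conf spark.dynamicAllocation.enabled=false \
//                --conf spark.streaming.dynamicAllocation.enabled=false \
//                --jars <same jars as in the question>
//
// Then, inside the shell, read back the configuration of the running SparkContext.
// `sc` is the SparkContext that spark-shell provides.
val coreDynAlloc = sc.getConf.getBoolean("spark.dynamicAllocation.enabled", false)
val streamingDynAlloc = sc.getConf.getBoolean("spark.streaming.dynamicAllocation.enabled", false)

println(s"spark.dynamicAllocation.enabled = $coreDynAlloc")
println(s"spark.streaming.dynamicAllocation.enabled = $streamingDynAlloc")
```

If both values print as false and the stream still shows no records, the warning is probably not the cause of the empty output. In that case the broker address, the SASL settings, and the jars passed to the shell would be the next things to check.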