Spark Structured Streaming with Kafka version 2

Time: 2018-10-14 12:04:43

Tags: apache-kafka spark-structured-streaming

We upgraded Kafka from version 0.9 to 2.0.

I need help finding the right client library for Spark Structured Streaming. The following dependency:

"org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.3.0"

does not work.

This is the error it throws:

11:46:18.061 [stream execution thread for [id = e393ea37-8009-4ce0-b996-94f767994fb8, runId = bc15eb7d-876d-4e01-8ee5-22205ec7fdcb]] DEBUG org.apache.kafka.clients.NetworkClient - [Consumer clientId=consumer-2, groupId=spark-kafka-source-8ce7f26f-e342-4b0d-85f1-a9f641b79629-1052905425-driver-0] Completed connection to node -1. Fetching API versions.
11:46:18.061 [stream execution thread for [id = e393ea37-8009-4ce0-b996-94f767994fb8, runId = bc15eb7d-876d-4e01-8ee5-22205ec7fdcb]] DEBUG org.apache.kafka.clients.NetworkClient - [Consumer clientId=consumer-2, groupId=spark-kafka-source-8ce7f26f-e342-4b0d-85f1-a9f641b79629-1052905425-driver-0] Initiating API versions fetch from node -1.
11:46:18.452 [stream execution thread for [id = e393ea37-8009-4ce0-b996-94f767994fb8, runId = bc15eb7d-876d-4e01-8ee5-22205ec7fdcb]] DEBUG org.apache.kafka.common.network.Selector - [Consumer clientId=consumer-2, groupId=spark-kafka-source-8ce7f26f-e342-4b0d-85f1-a9f641b79629-1052905425-driver-0] Connection with kafka-muhammad-45e0.aivencloud.com/18.203.67.147 disconnected
java.io.EOFException: null
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:119)
    at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:335)
    at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:296)
    at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:560)

1 answer:

Answer 0: (score: 1)

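Please provide more information (what exactly is the error?). Kafka brokers are designed to be backward compatible with older clients, which helps projects that lag well behind the latest client API, such as Spark.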

The artifactId you reference, spark-streaming-kafka-0-10, is for Spark Streaming (the DStream API). To use Spark Structured Streaming, you need spark-sql-kafka-0-10_2.11 instead.
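As a rough sketch of what that change might look like in an sbt build: the fragment below assumes the project is built with Scala 2.11 and stays on Spark 2.3.0, the version from the question. The `scalaVersion` value and the `spark-sql` line are assumptions for illustration, not something the question states.

```scala
// build.sbt fragment (a sketch, not a complete build definition)

// Assumption: Scala 2.11, so that %% resolves to the _2.11 artifacts.
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // Assumption: Spark SQL is declared here; it may instead be supplied by the cluster.
  "org.apache.spark" %% "spark-sql" % "2.3.0",
  // Kafka source/sink for Structured Streaming; resolves to spark-sql-kafka-0-10_2.11.
  "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.3.0"
)
```

With this artifact on the classpath, the Kafka source is normally selected with `spark.readStream.format("kafka")`, together with the `kafka.bootstrap.servers` option and a topic option such as `subscribe`.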