I am trying to read a Kafka topic from my Spark cluster using the Structured Streaming API with Spark's Kafka integration.
Create the Kafka stream:
val sparkSession = SparkSession.builder()
.master("local[*]")
.appName("some-app")
.getOrCreate()
import sparkSession.implicits._
val dataFrame = sparkSession
.readStream
.format("kafka")
.option("subscribepattern", "preprod-*")
.option("kafka.bootstrap.servers", "<brokerUrl>:9094")
.option("kafka.ssl.protocol", "TLS")
.option("kafka.security.protocol", "SSL")
.option("kafka.ssl.key.password", secretPassword)
.option("kafka.ssl.keystore.location", "/tmp/xyz.jks")
.option("kafka.ssl.keystore.password", secretPassword)
.option("kafka.ssl.truststore.location", "/abc.jks")
.option("kafka.ssl.truststore.password", secretPassword)
.load()
.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
.as[(String, String)]
.writeStream
.format("console")
.start()
.awaitTermination()
I run it with the following command, but it fails with an error:
/usr/local/spark/bin/spark-submit
--packages "org.apache.spark:spark-streaming-kafka-0-10_2.11:2.3.1,org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.1"
myjar.jar
Answer 0 (score: 0)
What version are your Kafka brokers, and how are these messages being produced?
If the messages carry record headers (https://issues.apache.org/jira/browse/KAFKA-4208), you will need a Kafka 0.11+ client to consume them, because older Kafka clients cannot read such messages. If that is the case, you can run the job with the following command:
/usr/local/spark/bin/spark-submit --packages "org.apache.kafka:kafka-clients:0.11.0.3,org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.1"
myjar.jar
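For reference, "messages with headers" means records produced with Kafka record headers, which require message format v2, i.e. Kafka 0.11+ on both the broker and the client. Below is a minimal sketch of a producer that attaches a header, so you can check whether your producers do something similar. The broker address, topic name, header key, and values are placeholders, and the SSL settings from the question are omitted for brevity.

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.kafka.common.serialization.StringSerializer

// Minimal sketch: produce one record that carries a Kafka record header.
// "<brokerUrl>:9094" and the topic "preprod-example" are placeholders.
val props = new Properties()
props.put("bootstrap.servers", "<brokerUrl>:9094")
props.put("key.serializer", classOf[StringSerializer].getName)
props.put("value.serializer", classOf[StringSerializer].getName)

val producer = new KafkaProducer[String, String](props)
val record = new ProducerRecord[String, String]("preprod-example", "some-key", "some-value")
// Record headers require message format v2, i.e. Kafka 0.11+ brokers and clients.
record.headers().add("trace-id", "abc123".getBytes("UTF-8"))
producer.send(record)
producer.close()

If records like this land on the subscribed topics, a consumer built against an older kafka-clients version cannot decode them, which is why the command above pins org.apache.kafka:kafka-clients:0.11.0.3 alongside the spark-sql-kafka package.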