I'm trying to implement wordCount by reading from Kafka, and I get a "type mismatch" error when using the mapWithState function.
Here is my code:
// make a connection to Kafka and read (key, value) pairs from it
val sparkConf = new SparkConf().setAppName("DirectKafkaAvg").setMaster("local[2]")
val ssc = new StreamingContext(sparkConf, Seconds(2))
val kafkaConf = Map(
"metadata.broker.list" -> "localhost:9092",
"zookeeper.connect" -> "localhost:2181",
"group.id" -> "kafka-spark-streaming",
"zookeeper.connection.timeout.ms" -> "1000")
val topics = Set("avg")
val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaConf, topics)
val value = messages.map{case (key, value) => value.split(',')}
val pairs = value.map(record => (record(1), record(2)))
// measure the average value for each key in a stateful manner
def mappingFunc(key: String, value: Option[Double], state: State[Double]): Option[(String, Double)] = {
  val sum = value.getOrElse(0.0) + state.getOption.getOrElse(0.0)
  val output = Option(key, sum)
  state.update(sum)
  output
}
val spec = StateSpec.function(mappingFunc _)
val stateDstream = pairs.mapWithState(spec)
// store the result in Cassandra
stateDstream.print()
ssc.start()
ssc.awaitTermination()
Here is the error log:
[error] KafkaSpark.scala:50: type mismatch;
[error] found : org.apache.spark.streaming.StateSpec[String,Double,Double,Option[(String, Double)]]
[error] required: org.apache.spark.streaming.StateSpec[String,String,?,?]
[error] val stateDstream = pairs.mapWithState(spec)
[error] ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
Does anyone know how to fix this?
Answer 0 (score: 1)
The pairs stream in your code is a pair of strings, but your mappingFunc assumes the second element of the pair has type Double. Try changing the line
val pairs = value.map(record => (record(1), record(2)))
to
val pairs = value.map(record => (record(1), record(2).toDouble))
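To see why this works without a Spark cluster, here is a minimal plain-Scala sketch (no Spark, so this is an illustration rather than the actual streaming job): the State[Double] object is simulated with a mutable Map, and the sample records of the form "id,key,value" are an assumption about the "avg" topic's format based on the indices used in the question.

```scala
import scala.collection.mutable

// Simulated keyed state, standing in for Spark's State[Double]
val state = mutable.Map.empty[String, Double]

// Same shape as the question's mappingFunc: running sum per key
def mappingFunc(key: String, value: Option[Double]): Option[(String, Double)] = {
  val sum = value.getOrElse(0.0) + state.getOrElse(key, 0.0)
  state.update(key, sum)
  Some((key, sum))
}

// Hypothetical "id,key,value" records, matching record(1)/record(2) indexing
val records = Seq("1,a,2.0", "2,a,3.5", "3,b,1.0").map(_.split(','))

// The corrected line: parse the value field to Double so the pair
// is (String, Double), matching mappingFunc's value type
val pairs = records.map(record => (record(1), record(2).toDouble))

val out = pairs.map { case (k, v) => mappingFunc(k, Some(v)) }
// out accumulates per-key running sums: a -> 2.0, then 5.5; b -> 1.0
```

Once pairs has type (String, Double), the inferred StateSpec[String, Double, Double, Option[(String, Double)]] matches what mapWithState expects.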
Answer 1 (score: 0)
You have to add the type parameters explicitly:
val spec = StateSpec.function[String,Double,Double,Option[(String, Double)]](mappingFunc _)