java.io.NotSerializableException: DStream checkpointing has been enabled but the DStreams with their functions are not serializable

Date: 2016-07-20 16:36:02

Tags: java scala apache-spark stream

When executing this code in Spark Streaming, I get a serialization error (see below):

    val endPoint = ConfigFactory.load("application.conf").getConfig("conf").getString("endPoint")
    val operation = ConfigFactory.load("application.conf").getConfig("conf").getString("operation")
    val param = ConfigFactory.load("application.conf").getConfig("conf").getString("param")

    result.foreachRDD { jsrdd =>
      jsrdd.map { jsobj =>
        val docId = (jsobj \ "id").as[JsString].value
        val response: HttpResponse[String] = Http(apiURL + "/" + endPoint + "/" + docId + "/" + operation)
          .timeout(connTimeoutMs = 1000, readTimeoutMs = 5000)
          .param(param, jsobj.toString())
          .asString
        val output = Json.parse(response.body) \ "annotation" \ "tags"
        jsobj.as[JsObject] + ("tags", output.as[JsObject])
      }
    }

So, as far as I can tell, the problem lies with the scalaj Http API. How can I solve this? Obviously I cannot change the API itself.
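Judging from the stack trace, the real culprit is probably not scalaj-http but the closure itself: the lambda passed to `map` references fields of the enclosing class `org.consumer.kafka.KafkaJsonConsumer` (note the `$outer` field in the trace), so Spark tries to serialize the whole consumer object. A common workaround is to copy every needed value into a local `val` before building the closure, so the lambda captures only serializable values. Below is a minimal, Spark-free sketch of that principle; the `Consumer` class and its `endPoint` field are hypothetical stand-ins for the question's setup:

```scala
import java.io._

// Non-serializable class standing in for KafkaJsonConsumer.
class Consumer {
  val endPoint = "annotate"

  // Broken: the lambda reads the field through `this`, so serializing
  // the function drags the whole (non-serializable) Consumer along.
  def brokenFn: String => String = s => s"$endPoint/$s"

  // Fix: copy the field into a local val first; the closure then
  // captures only the String, not the outer Consumer instance.
  def fixedFn: String => String = {
    val ep = endPoint
    s => s"$ep/$s"
  }
}

// Returns true if `obj` survives Java serialization, which is what
// Spark does to closures before shipping them to executors.
def serializable(obj: AnyRef): Boolean =
  try {
    new ObjectOutputStream(new ByteArrayOutputStream).writeObject(obj)
    true
  } catch { case _: NotSerializableException => false }

val c = new Consumer
println(serializable(c.brokenFn)) // false: captures the outer Consumer
println(serializable(c.fixedFn))  // true: captures only a String
```

Applied to the question's code, that would mean assigning `apiURL`, `endPoint`, `operation` and `param` to local `val`s inside the method that builds the stream (or marking the offending fields `@transient`) before calling `jsrdd.map`.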


    java.io.NotSerializableException: DStream checkpointing has been enabled but the DStreams with their functions are not serializable
    org.consumer.kafka.KafkaJsonConsumer
    Serialization stack:
        - object not serializable (class: org.consumer.kafka.KafkaJsonConsumer, value: org.consumer.kafka.KafkaJsonConsumer@f91da5e)
        - field (class: org.consumer.kafka.KafkaJsonConsumer$$anonfun$run$1, name: $outer, type: class org.consumer.kafka.KafkaJsonConsumer)
        - object (class org.consumer.kafka.KafkaJsonConsumer$$anonfun$run$1, )
        - field (class: org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3, name: cleanedF$1, type: interface scala.Function1)
        - object (class org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3, )
        - writeObject data (class: org.apache.spark.streaming.dstream.DStream)
        - object (class org.apache.spark.streaming.dstream.ForEachDStream, org.apache.spark.streaming.dstream.ForEachDStream@761956ac)
        - writeObject data (class: org.apache.spark.streaming.dstream.DStreamCheckpointData)
        - object (class org.apache.spark.streaming.dstream.DStreamCheckpointData, [0 checkpoint files])
        - writeObject data (class: org.apache.spark.streaming.dstream.DStream)
        - object (class org.apache.spark.streaming.dstream.ForEachDStream, org.apache.spark.streaming.dstream.ForEachDStream@704641e3)
        - element of array (index: 0)
        - array (class [Ljava.lang.Object;, size 16)
        - field (class: scala.collection.mutable.ArrayBuffer, name: array, type: class [Ljava.lang.Object;)
        - object (class scala.collection.mutable.ArrayBuffer, ArrayBuffer(org.apache.spark.streaming.dstream.ForEachDStream@704641e3, org.apache.spark.streaming.dstream.ForEachDStream@761956ac))
        - writeObject data (class: org.apache.spark.streaming.dstream.DStreamCheckpointData)
        - object (class org.apache.spark.streaming.dstream.DStreamCheckpointData, [0 checkpoint files])

0 Answers:

There are no answers yet