java.io.NotSerializableException in Spark Streaming with checkpointing enabled

Time: 2016-07-22 09:13:20

Tags: apache-spark spark-streaming rdd

The following code:

def main(args: Array[String]) {
    val sc = new SparkContext
    val sec = Seconds(3)
    val ssc = new StreamingContext(sc, sec)
    ssc.checkpoint("./checkpoint")
    val rdd = ssc.sparkContext.parallelize(Seq("a","b","c"))
    val inputDStream = new ConstantInputDStream(ssc, rdd)

    inputDStream.transform(rdd => {
        val buf = ListBuffer[String]()
        buf += "1"
        buf += "2"
        buf += "3"
        val other_rdd = ssc.sparkContext.parallelize(buf)   // create a new rdd
        rdd.union(other_rdd)
    }).print()

    ssc.start()
    ssc.awaitTermination()
}

throws this exception:

java.io.NotSerializableException: DStream checkpointing has been enabled but the DStreams with their functions are not serializable
org.apache.spark.streaming.StreamingContext
Serialization stack:
    - object not serializable (class: org.apache.spark.streaming.StreamingContext, value: org.apache.spark.streaming.StreamingContext@5626e185)
    - field (class: com.mirrtalk.Test$$anonfun$main$1, name: ssc$1, type: class org.apache.spark.streaming.StreamingContext)
    - object (class com.mirrtalk.Test$$anonfun$main$1, <function1>)
    - field (class: org.apache.spark.streaming.dstream.DStream$$anonfun$transform$1$$anonfun$apply$21, name: cleanedF$2, type: interface scala.Function1)
    - object (class org.apache.spark.streaming.dstream.DStream$$anonfun$transform$1$$anonfun$apply$21, <function2>)
    - field (class: org.apache.spark.streaming.dstream.DStream$$anonfun$transform$2$$anonfun$5, name: cleanedF$3, type: interface scala.Function2)
    - object (class org.apache.spark.streaming.dstream.DStream$$anonfun$transform$2$$anonfun$5, <function2>)
    - field (class: org.apache.spark.streaming.dstream.TransformedDStream, name: transformFunc, type: interface scala.Function2)

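When I remove the line ssc.checkpoint("./checkpoint"), the application works fine, but I need checkpointing enabled.

How can I fix this problem while keeping checkpointing enabled?

1 answer:

Answer 0 (score: 2)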

You can move the context initialization and configuration tasks outside main.
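
A minimal sketch of that restructuring, assuming the same Spark Streaming API as the question. The object name CheckpointedApp, the application name, and the local[2] master URL are illustrative assumptions, not taken from the original post. The serialization stack in the question shows the transform function holding the StreamingContext in a captured field (ssc$1) because ssc was a local variable of main; keeping the contexts as fields of the enclosing object is meant to avoid that capture.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.ConstantInputDStream

// Hypothetical object name; the app name and master URL are assumptions for a local run.
object CheckpointedApp {
  // The contexts are fields of the singleton object rather than locals of main,
  // so the function passed to transform does not have to close over them.
  val sc = new SparkContext(
    new SparkConf().setAppName("checkpoint-demo").setMaster("local[2]"))
  val ssc = new StreamingContext(sc, Seconds(3))
  ssc.checkpoint("./checkpoint") // checkpointing stays enabled

  def main(args: Array[String]): Unit = {
    val rdd = ssc.sparkContext.parallelize(Seq("a", "b", "c"))
    val inputDStream = new ConstantInputDStream(ssc, rdd)

    inputDStream.transform { batch =>
      // Build the extra RDD for each batch and union it with the input.
      val otherRdd = ssc.sparkContext.parallelize(Seq("1", "2", "3"))
      batch.union(otherRdd)
    }.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Whether the closure stays free of the context can depend on the Scala and Spark versions in use. Another option, not part of the answer above, is to call batch.sparkContext inside the function, which does not reference ssc at all.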