Resolving the "Task not serializable" exception with KryoSerializationWrapper

Time: 2019-12-19 10:00:49

Tags: scala apache-spark

Please refer to the following question: Task not serializable: java.io.NotSerializableException when calling function outside closure only on classes not objects

The OP of that question posted this failing code:

object NOTworking extends App {
  new testing().doIT
}

// adding "extends Serializable" won't help
class testing {  
  val list = List(1,2,3)  
  val rddList = Spark.ctx.parallelize(list)

  def doIT =  {
    // again calling the function someFunc
    val after = rddList.map(someFunc(_))
    // this will crash (Spark is lazy, so the map only runs on collect)
    after.collect().map(println(_))
  }

  def someFunc(a:Int) = a+1
}

Nilesh then suggested a solution that uses KryoSerializationWrapper:

def genMapper(kryoWrapper: KryoSerializationWrapper[(Foo => Bar)])
               (foo: Foo) : Bar = {
    kryoWrapper.value.apply(foo)
}
val mapper = genMapper(KryoSerializationWrapper(new Blah(abc))) _
rdd.flatMap(mapper).collectAsMap()

// must be a class: a Scala object cannot take constructor parameters
class Blah(abc: ABC) extends (Foo => Bar) {
    def apply(foo: Foo) : Bar = {
        // This is the real function
        ???
    }
}

However, Nilesh did not show how to change the OP's code to use genMapper. Can anyone rewrite the OP's NOTworking code using the genMapper solution?
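
For reference, here is a minimal sketch of how the NOTworking code might be adapted to the genMapper pattern. It is not Nilesh's actual code. KryoSerializationWrapper is not part of Spark, so the version below is an assumed stand-in built on Spark's own KryoSerializer. The names Working, Mappers and SomeFunc are hypothetical, and a local SparkContext replaces the undefined Spark.ctx helper. The idea is to move someFunc into its own function class, so that the closure captures only the wrapper and not the non-serializable enclosing instance.

```scala
import java.io.{ObjectInputStream, ObjectOutputStream}
import java.nio.ByteBuffer
import scala.reflect.ClassTag
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.serializer.KryoSerializer

// Assumed stand-in for the wrapper: Java serialization only ever sees a
// byte array, while Kryo serializes the wrapped value itself.
class KryoSerializationWrapper[T: ClassTag] extends Serializable {
  @transient var value: T = _
  private var bytes: Array[Byte] = _

  // Called by Java serialization on the driver.
  private def writeObject(out: ObjectOutputStream): Unit = {
    val buf = KryoSerializationWrapper.kryo.newInstance().serialize(value)
    bytes = new Array[Byte](buf.remaining())
    buf.get(bytes)
    out.defaultWriteObject()
  }

  // Called by Java deserialization on the executor.
  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject()
    value = KryoSerializationWrapper.kryo.newInstance()
      .deserialize[T](ByteBuffer.wrap(bytes))
  }
}

object KryoSerializationWrapper {
  private lazy val kryo = new KryoSerializer(new SparkConf())

  def apply[T: ClassTag](value: T): KryoSerializationWrapper[T] = {
    val wrapper = new KryoSerializationWrapper[T]
    wrapper.value = value
    wrapper
  }
}

// The OP's someFunc as a standalone function class (hypothetical name).
// It deliberately does not extend Serializable; Kryo handles it.
class SomeFunc extends (Int => Int) {
  def apply(a: Int): Int = a + 1
}

object Mappers {
  // Same shape as Nilesh's genMapper, specialised to Int => Int.
  def genMapper(wrapper: KryoSerializationWrapper[Int => Int])(a: Int): Int =
    wrapper.value.apply(a)
}

object Working {
  // A main method is used instead of "extends App", which the Spark
  // documentation advises against.
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("working").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rddList = sc.parallelize(List(1, 2, 3))
      // The partially applied function captures only the wrapper.
      val mapper =
        Mappers.genMapper(KryoSerializationWrapper[Int => Int](new SomeFunc)) _
      val after = rddList.map(mapper)
      after.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

For a function as small as someFunc, moving it into a standalone object, or copying it into a local val before calling map, would avoid the exception without any wrapper. The wrapper pattern mainly pays off when the function object holds state that cannot be made Serializable.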

0 answers:

No answers yet