For-comprehension error: org.apache.spark.SparkException: Task not serializable

Asked: 2019-12-09 05:02:51

Tags: scala apache-spark rdd

I'm getting this error:

org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:345)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:335)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:159)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2299)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:797)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:796)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.mapPartitions(RDD.scala:796)
at com.stellarloyalty.batch.jobs.RefreshSegmentsJob$$anonfun$run$1.apply(RefreshSegmentsJob.scala:111)
at com.stellarloyalty.batch.jobs.RefreshSegmentsJob$$anonfun$run$1.apply(RefreshSegmentsJob.scala:73)
at scala.util.Try$.apply(Try.scala:192)

The culprit is this for-comprehension:

                  for {
                    s <- sList
                    acc = accumulators.get(s)
                  } {
                    acc.add(1)
                  }
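
As far as I can tell, the compiler desugars this for-comprehension into closures that capture accumulators, which is a field of the enclosing job class, so Spark ends up trying to serialize the whole RefreshSegmentsJob instance. A rough sketch of the equivalent desugared code (assuming sList is an ordinary Scala collection and accumulators is a Map):

                  // Roughly what the for-comprehension above expands to:
                  // a map over the generator, then a foreach for the body.
                  // Both anonymous functions close over `accumulators`,
                  // which belongs to the surrounding job instance.
                  sList
                    .map { s => val acc = accumulators.get(s); (s, acc) }
                    .foreach { case (_, acc) => acc.add(1) }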

Please help: how can I make this for-comprehension serializable?

0 Answers:

No answers yet.