Serializable
I am a beginner. I am running the code below on my local machine, and I have made sure that Person implements Serializable, but the code throws this exception:
17/09/30 18:13:26 INFO SparkContext: Starting job: count at App.java:32
17/09/30 18:13:26 INFO DAGScheduler: Got job 0 (count at App.java:32) with 1 output partitions
17/09/30 18:13:26 INFO DAGScheduler: Final stage: ResultStage 0 (count at App.java:32)
17/09/30 18:13:26 INFO DAGScheduler: Parents of final stage: List()
17/09/30 18:13:26 INFO DAGScheduler: Missing parents: List()
17/09/30 18:13:26 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[4] at rdd at App.java:32), which has no missing parents
17/09/30 18:13:26 INFO TaskSchedulerImpl: Cancelling stage 0
17/09/30 18:13:26 INFO DAGScheduler: ResultStage 0 (count at App.java:32) failed in Unknown s due to Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException: scala.collection.Iterator$$anon$11
Serialization stack:
- object not serializable (class: scala.collection.Iterator$$anon$11, value: empty iterator)
- field (class: scala.collection.Iterator$$anonfun$toStream$1, name: $outer, type: interface scala.collection.Iterator)
- object (class scala.collection.Iterator$$anonfun$toStream$1, <function0>)
- field (class: scala.collection.immutable.Stream$Cons, name: tl, type: interface scala.Function0)
- object (class scala.collection.immutable.Stream$Cons, Stream([27,panfei]))
- field (class: org.apache.spark.sql.execution.LocalTableScanExec, name: rows, type: interface scala.collection.Seq)
- object (class org.apache.spark.sql.execution.LocalTableScanExec, LocalTableScan [age#0, name#1]
)
- field (class: org.apache.spark.sql.execution.DeserializeToObjectExec, name: child, type: class org.apache.spark.sql.execution.SparkPlan)
- object (class org.apache.spark.sql.execution.DeserializeToObjectExec, DeserializeToObject createexternalrow(age#0, name#1.toString, StructField(age,IntegerType,true), StructField(name,StringType,true)), obj#6: org.apache.spark.sql.Row
+- LocalTableScan [age#0, name#1]
)
- field (class: org.apache.spark.sql.execution.DeserializeToObjectExec$$anonfun$2, name: $outer, type: class org.apache.spark.sql.execution.DeserializeToObjectExec)
- object (class org.apache.spark.sql.execution.DeserializeToObjectExec$$anonfun$2, <function2>)
- field (class: org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1, name: f$22, type: interface scala.Function2)
- object (class org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1, <function0>)
- field (class: org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$apply$24, name: $outer, type: class org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1)
- object (class org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$apply$24, <function3>)
- field (class: org.apache.spark.rdd.MapPartitionsRDD, name: f, type: interface scala.Function3)
- object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[2] at rdd at App.java:32)
- field (class: org.apache.spark.NarrowDependency, name: _rdd, type: class org.apache.spark.rdd.RDD)
- object (class org.apache.spark.OneToOneDependency, org.apache.spark.OneToOneDependency@3420d0d9)
- writeObject data (class: scala.collection.immutable.$colon$colon)
- object (class scala.collection.immutable.$colon$colon, List(org.apache.spark.OneToOneDependency@3420d0d9))
- field (class: org.apache.spark.rdd.RDD, name: org$apache$spark$rdd$RDD$$dependencies_, type: interface scala.collection.Seq)
- object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[3] at rdd at App.java:32)
- field (class: org.apache.spark.NarrowDependency, name: _rdd, type: class org.apache.spark.rdd.RDD)
- object (class org.apache.spark.OneToOneDependency, org.apache.spark.OneToOneDependency@7342323d)
- writeObject data (class: scala.collection.immutable.$colon$colon)
- object (class scala.collection.immutable.$colon$colon, List(org.apache.spark.OneToOneDependency@7342323d))
- field (class: org.apache.spark.rdd.RDD, name: org$apache$spark$rdd$RDD$$dependencies_, type: interface scala.collection.Seq)
- object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[4] at rdd at App.java:32)
- field (class: scala.Tuple2, name: _1, type: class java.lang.Object)
- object (class scala.Tuple2, (MapPartitionsRDD[4] at rdd at App.java:32,<function2>))
17/09/30 18:13:26 INFO DAGScheduler: Job 0 failed: count at App.java:32, took 0.098482 s
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException: scala.collection.Iterator$$anon$11
Serialization stack:
- object not serializable (class: scala.collection.Iterator$$anon$11, value: empty iterator)
- field (class: scala.collection.Iterator$$anonfun$toStream$1, name: $outer, type: interface scala.collection.Iterator)
- object (class scala.collection.Iterator$$anonfun$toStream$1, <function0>)
- field (class: scala.collection.immutable.Stream$Cons, name: tl, type: interface scala.Function0)
- object (class scala.collection.immutable.Stream$Cons, Stream([27,panfei]))
- field (class: org.apache.spark.sql.execution.LocalTableScanExec, name: rows, type: interface scala.collection.Seq)
- object (class org.apache.spark.sql.execution.LocalTableScanExec, LocalTableScan [age#0, name#1]
)
- field (class: org.apache.spark.sql.execution.DeserializeToObjectExec, name: child, type: class org.apache.spark.sql.execution.SparkPlan)
- object (class org.apache.spark.sql.execution.DeserializeToObjectExec, DeserializeToObject createexternalrow(age#0, name#1.toString, StructField(age,IntegerType,true), StructField(name,StringType,true)), obj#6: org.apache.spark.sql.Row
+- LocalTableScan [age#0, name#1]
)
- field (class: org.apache.spark.sql.execution.DeserializeToObjectExec$$anonfun$2, name: $outer, type: class org.apache.spark.sql.execution.DeserializeToObjectExec)
- object (class org.apache.spark.sql.execution.DeserializeToObjectExec$$anonfun$2, <function2>)
- field (class: org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1, name: f$22, type: interface scala.Function2)
- object (class org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1, <function0>)
- field (class: org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$apply$24, name: $outer, type: class org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1)
- object (class org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$apply$24, <function3>)
- field (class: org.apache.spark.rdd.MapPartitionsRDD, name: f, type: interface scala.Function3)
- object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[2] at rdd at App.java:32)
- field (class: org.apache.spark.NarrowDependency, name: _rdd, type: class org.apache.spark.rdd.RDD)
- object (class org.apache.spark.OneToOneDependency, org.apache.spark.OneToOneDependency@3420d0d9)
- writeObject data (class: scala.collection.immutable.$colon$colon)
- object (class scala.collection.immutable.$colon$colon, List(org.apache.spark.OneToOneDependency@3420d0d9))
- field (class: org.apache.spark.rdd.RDD, name: org$apache$spark$rdd$RDD$$dependencies_, type: interface scala.collection.Seq)
- object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[3] at rdd at App.java:32)
- field (class: org.apache.spark.NarrowDependency, name: _rdd, type: class org.apache.spark.rdd.RDD)
- object (class org.apache.spark.OneToOneDependency, org.apache.spark.OneToOneDependency@7342323d)
- writeObject data (class: scala.collection.immutable.$colon$colon)
- object (class scala.collection.immutable.$colon$colon, List(org.apache.spark.OneToOneDependency@7342323d))
- field (class: org.apache.spark.rdd.RDD, name: org$apache$spark$rdd$RDD$$dependencies_, type: interface scala.collection.Seq)
- object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[4] at rdd at App.java:32)
- field (class: scala.Tuple2, name: _1, type: class java.lang.Object)
- object (class scala.Tuple2, (MapPartitionsRDD[4] at rdd at App.java:32,<function2>))
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1423)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1422)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1422)
at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1000)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:918)
at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:862)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1613)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1605)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1594)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:628)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1918)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1931)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1944)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1958)
at org.apache.spark.rdd.RDD.count(RDD.scala:1157)
at com.ctrip.market.dmp.spark.app.App.main(App.java:32)
Caused by: java.io.NotSerializableException: scala.collection.Iterator$$anon$11
Serialization stack:
- object not serializable (class: scala.collection.Iterator$$anon$11, value: empty iterator)
- field (class: scala.collection.Iterator$$anonfun$toStream$1, name: $outer, type: interface scala.collection.Iterator)
- object (class scala.collection.Iterator$$anonfun$toStream$1, <function0>)
- field (class: scala.collection.immutable.Stream$Cons, name: tl, type: interface scala.Function0)
- object (class scala.collection.immutable.Stream$Cons, Stream([27,panfei]))
- field (class: org.apache.spark.sql.execution.LocalTableScanExec, name: rows, type: interface scala.collection.Seq)
- object (class org.apache.spark.sql.execution.LocalTableScanExec, LocalTableScan [age#0, name#1]
)
- field (class: org.apache.spark.sql.execution.DeserializeToObjectExec, name: child, type: class org.apache.spark.sql.execution.SparkPlan)
- object (class org.apache.spark.sql.execution.DeserializeToObjectExec, DeserializeToObject createexternalrow(age#0, name#1.toString, StructField(age,IntegerType,true), StructField(name,StringType,true)), obj#6: org.apache.spark.sql.Row
+- LocalTableScan [age#0, name#1]
)
- field (class: org.apache.spark.sql.execution.DeserializeToObjectExec$$anonfun$2, name: $outer, type: class org.apache.spark.sql.execution.DeserializeToObjectExec)
- object (class org.apache.spark.sql.execution.DeserializeToObjectExec$$anonfun$2, <function2>)
- field (class: org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1, name: f$22, type: interface scala.Function2)
- object (class org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1, <function0>)
- field (class: org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$apply$24, name: $outer, type: class org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1)
- object (class org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$apply$24, <function3>)
- field (class: org.apache.spark.rdd.MapPartitionsRDD, name: f, type: interface scala.Function3)
- object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[2] at rdd at App.java:32)
- field (class: org.apache.spark.NarrowDependency, name: _rdd, type: class org.apache.spark.rdd.RDD)
- object (class org.apache.spark.OneToOneDependency, org.apache.spark.OneToOneDependency@3420d0d9)
- writeObject data (class: scala.collection.immutable.$colon$colon)
- object (class scala.collection.immutable.$colon$colon, List(org.apache.spark.OneToOneDependency@3420d0d9))
- field (class: org.apache.spark.rdd.RDD, name: org$apache$spark$rdd$RDD$$dependencies_, type: interface scala.collection.Seq)
- object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[3] at rdd at App.java:32)
- field (class: org.apache.spark.NarrowDependency, name: _rdd, type: class org.apache.spark.rdd.RDD)
- object (class org.apache.spark.OneToOneDependency, org.apache.spark.OneToOneDependency@7342323d)
- writeObject data (class: scala.collection.immutable.$colon$colon)
- object (class scala.collection.immutable.$colon$colon, List(org.apache.spark.OneToOneDependency@7342323d))
- field (class: org.apache.spark.rdd.RDD, name: org$apache$spark$rdd$RDD$$dependencies_, type: interface scala.collection.Seq)
- object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[4] at rdd at App.java:32)
- field (class: scala.Tuple2, name: _1, type: class java.lang.Object)
- object (class scala.Tuple2, (MapPartitionsRDD[4] at rdd at App.java:32,<function2>))
at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:993)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:918)
at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:862)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1613)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1605)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1594)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
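For context, here is a minimal, hypothetical sketch of the kind of driver program the plan and stack trace above point to. The Person bean, the single [27,panfei] row, and the rdd().count() call at line 32 are taken from the log; everything else (names and structure) is an assumption, not the asker's actual code.

```java
import java.io.Serializable;
import java.util.Collections;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class App {

    // Person is a plain bean and already implements Serializable,
    // which is why the NotSerializableException is surprising here.
    public static class Person implements Serializable {
        private int age;
        private String name;

        public Person() {}
        public Person(int age, String name) { this.age = age; this.name = name; }

        public int getAge() { return age; }
        public void setAge(int age) { this.age = age; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("serializable-demo")
                .master("local[*]")
                .getOrCreate();

        // A local one-row table: this is what appears in the plan as
        // LocalTableScan [age#0, name#1] / Stream([27,panfei]).
        Dataset<Row> people = spark.createDataFrame(
                Collections.singletonList(new Person(27, "panfei")), Person.class);

        // Converting to an RDD and counting forces task serialization;
        // this is the "count at App.java:32" in the stack trace, where
        // the java.io.NotSerializableException surfaces.
        long count = people.rdd().count();
        System.out.println(count);

        spark.stop();
    }
}
```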
Answer 0 (score: 1)
This issue was resolved by changing the Scala version from 2.10 to 2.11 everywhere. It had me confused as well.
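In practice, "everywhere" means every Spark artifact (and any other Scala dependency) must carry the same Scala binary suffix and match the Scala version of the Spark runtime. Assuming a Maven build, an aligned dependency section might look like the sketch below; the version numbers are illustrative, not taken from the question.

```xml
<!-- Hypothetical Maven excerpt (versions illustrative). The point is that
     every Spark artifact uses the same Scala binary suffix, here _2.11,
     matching the Scala version of the Spark runtime. Mixing _2.10 and
     _2.11 artifacts can surface as confusing errors like the
     NotSerializableException above. -->
<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.1.1</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.1.1</version>
  </dependency>
</dependencies>
```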
Answer 1 (score: -1)
Your main class does not need to implement Serializable.