Edit: this question is not a duplicate of Resolving dependency problems in Apache Spark; it is specifically about the anonymous function i => i.
I am running this code:
import org.apache.spark.{SparkConf, SparkContext}

object Tmp {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf()
      .setAppName("Spark Test")
      .setMaster("spark://[ip-of-spark-master]:7077")
    val sc = new SparkContext(sparkConf)
    sc.parallelize(0 until 10000).map(i => i).count()
  }
}
When I run it from my machine, without having sent any jar, it fails with java.lang.ClassNotFoundException: Tmp$$anonfun$main$1 (that is, the function i => i cannot be found). If I leave out the map(i => i), it works.
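In other words, this variant completes fine, presumably because it references no class of mine:

// Same job minus the map step: no anonymous function class
// (Tmp$$anonfun$main$1) is referenced, and the count succeeds.
sc.parallelize(0 until 10000).count()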
To be clear: I would now think that for a simple function like this, Spark serializes it and no jar needs to be sent. I am fairly sure I have done this before.
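(For reference, the only workaround I know of is to ship the packaged jar explicitly via SparkConf.setJars; a sketch, where the jar path is a placeholder for my build output:)

import org.apache.spark.{SparkConf, SparkContext}

object Tmp {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf()
      .setAppName("Spark Test")
      .setMaster("spark://[ip-of-spark-master]:7077")
      // Placeholder path: ship the application jar so remote JVMs
      // can load compiled closure classes such as Tmp$$anonfun$main$1.
      .setJars(Seq("/path/to/tmp-assembly.jar"))
    val sc = new SparkContext(sparkConf)
    sc.parallelize(0 until 10000).map(i => i).count()
  }
}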
Am I missing something?
Looking at the stack trace, it fails on the driver, so on my side:
Driver stacktrace:
[error] at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1454)
[error] at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1442)
[error] at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1441)
...
[error] Caused by: java.lang.ClassNotFoundException: Tmp$$anonfun$1
[error] at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
[error] at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
[error] at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
[error] at java.lang.Class.forName0(Native Method)
[error] at java.lang.Class.forName(Class.java:348)
[error] at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
...
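If I read the bottom frames right, JavaDeserializationStream resolves the closure class by name via Class.forName, so this one line (class name taken from the trace) reproduces the same error on any JVM that does not have my classes on its classpath:

// Resolving the compiled closure class by name, as the deserializer does;
// without the application classes on the classpath this throws
// java.lang.ClassNotFoundException, matching the trace above.
Class.forName("Tmp$$anonfun$1")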