Running a Spark job with Play Framework + Spark master/worker on a single Mac

Asked: 2014-12-12 07:41:26

Tags: scala playframework apache-spark

I am trying to run a Spark job with Play Framework plus a Spark master/worker, all on a single Mac.

When the job starts, I hit a java.lang.ClassNotFoundException.

Could you tell me how to fix it?

Here is the code on GitHub: https://github.com/TomoyaIgarashi/spark_cluster_sample
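
For context, here is a minimal sketch of the kind of Play 2.2 controller that triggers this failure (an illustration, not the exact repo code): any function literal passed to a Spark transformation compiles into an anonymous class under controllers.Application, and that class must be loadable on every worker.

    package controllers

    import org.apache.spark.{SparkConf, SparkContext}
    import play.api.mvc.{Action, Controller}

    object Application extends Controller {

      def index = Action {
        val conf = new SparkConf()
          .setAppName("spark_cluster_sample")
          .setMaster("spark://Tomoya-Igarashis-MacBook-Air.local:7077")
        val sc = new SparkContext(conf)
        try {
          // The closure below compiles into controllers.Application$$anonfun$...,
          // which the remote executor must be able to load from its classpath.
          val sum = sc.parallelize(1 to 100).map(_ * 2).reduce(_ + _)
          Ok(s"sum = $sum")
        } finally {
          sc.stop()
        }
      }
    }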

Environment:

Mac 10.9.5
Java 1.7.0_71
Play 2.2.3
Spark 1.1.1

Setup history:

> cd ~
> git clone git@github.com:apache/spark.git
> cd spark
> git checkout -b v1.1.1 v1.1.1
> sbt/sbt assembly
> vi ~/.bashrc
export SPARK_HOME=/Users/tomoya/spark
> . ~/.bashrc
> hostname
Tomoya-Igarashis-MacBook-Air.local
> vi $SPARK_HOME/conf/slaves
Tomoya-Igarashis-MacBook-Air.local
> play new spark_cluster_sample
default name
type -> scala

Run history:

> $SPARK_HOME/sbin/start-all.sh
> jps
> which play
/Users/tomoya/play/play
> git clone https://github.com/TomoyaIgarashi/spark_cluster_sample
> cd spark_cluster_sample
> play run

Error trace:

play.api.Application$$anon$1: Execution exception[[SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 6, 192.168.1.29):
    java.lang.ClassNotFoundException: controllers.Application$$anonfun$index$1$$anonfun$3
    java.net.URLClassLoader$1.run(URLClassLoader.java:372)
    java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    java.security.AccessController.doPrivileged(Native Method)
    java.net.URLClassLoader.findClass(URLClassLoader.java:360)
    java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    java.lang.Class.forName0(Native Method)
    java.lang.Class.forName(Class.java:340)
    org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:59)
    java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
    java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
    org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
    org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
    org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:57)
    org.apache.spark.scheduler.Task.run(Task.scala:54)
    org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    java.lang.Thread.run(Thread.java:745)
Driver stacktrace:]]
    at play.api.Application$class.handleError(Application.scala:293) ~[play_2.10.jar:2.2.3]
    at play.api.DefaultApplication.handleError(Application.scala:399) [play_2.10.jar:2.2.3]
    at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$13$$anonfun$apply$1.applyOrElse(PlayDefaultUpstreamHandler.scala:166) [play_2.10.jar:2.2.3]
    at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$13$$anonfun$apply$1.applyOrElse(PlayDefaultUpstreamHandler.scala:163) [play_2.10.jar:2.2.3]
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33) [scala-library-2.10.4.jar:na]
    at scala.util.Failure$$anonfun$recover$1.apply(Try.scala:185) [scala-library-2.10.4.jar:na]
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 6, 192.168.1.29): java.lang.ClassNotFoundException: controllers.Application$$anonfun$index$1$$anonfun$3
    java.net.URLClassLoader$1.run(URLClassLoader.java:372)
    java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    java.security.AccessController.doPrivileged(Native Method)
    java.net.URLClassLoader.findClass(URLClassLoader.java:360)
    java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    java.lang.Class.forName0(Native Method)
    java.lang.Class.forName(Class.java:340)
    org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:59)
    java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
    java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
    org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
    org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
    org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:57)
    org.apache.spark.scheduler.Task.run(Task.scala:54)
    org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185) ~[spark-core_2.10-1.1.1.jar:1.1.1]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174) ~[spark-core_2.10-1.1.1.jar:1.1.1]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173) ~[spark-core_2.10-1.1.1.jar:1.1.1]
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) ~[scala-library-2.10.4.jar:na]
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) ~[scala-library-2.10.4.jar:na]
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173) ~[spark-core_2.10-1.1.1.jar:1.1.1]

Regards,

1 Answer:

Answer 0 (score: 0)

I found your last commit in the sample application: https://github.com/TomoyaIgarashi/spark_cluster_sample/commit/5e16a4d9291e83437ce2e80015bb6558df6e7feb

Did that fix the class-not-found exception? After adding the jar the same way, I ran into some other NullPointer issues. Thanks!
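
For anyone hitting the same ClassNotFoundException: the usual fix, in the spirit of that commit, is to ship the Play application's compiled classes to the executors, either via SparkConf.setJars when building the context or via SparkContext.addJar afterwards. A minimal sketch, assuming the app has been packaged into a jar first (the jar path below is hypothetical and depends on your project name and Scala version):

    import org.apache.spark.{SparkConf, SparkContext}

    // Package the Play app first (e.g. with `play package`), then ship the jar
    // with the job so executors can resolve controllers.Application$$anonfun$... .
    val conf = new SparkConf()
      .setAppName("spark_cluster_sample")
      .setMaster("spark://Tomoya-Igarashis-MacBook-Air.local:7077")
      .setJars(Seq("target/scala-2.10/spark_cluster_sample_2.10-1.0-SNAPSHOT.jar")) // hypothetical path
    val sc = new SparkContext(conf)

    // Equivalent alternative, once the context exists:
    // sc.addJar("target/scala-2.10/spark_cluster_sample_2.10-1.0-SNAPSHOT.jar")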