我是新来的火花。我在我的mac上以独立模式运行Spark。我带来了主人和工人,他们都很好。主服务器的日志文件如下所示:
...
14/02/25 18:52:43 INFO Slf4jLogger: Slf4jLogger started
14/02/25 18:52:43 INFO Remoting: Starting remoting
14/02/25 18:52:43 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkMaster@Shirishs-MacBook-Pro.local:7077]
14/02/25 18:52:43 INFO Master: Starting Spark master at spark://Shirishs-MacBook-Pro.local:7077
14/02/25 18:52:43 INFO MasterWebUI: Started Master web UI at http://192.168.1.106:8080
14/02/25 18:52:43 INFO Master: I have been elected leader! New state: ALIVE
14/02/25 18:53:03 INFO Master: Registering worker Shirishs-MacBook-Pro.local:53956 with 4 cores, 15.0 GB RAM
工作日志如下所示:
14/02/25 18:53:02 INFO Slf4jLogger: Slf4jLogger started
14/02/25 18:53:02 INFO Remoting: Starting remoting
14/02/25 18:53:02 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkWorker@192.168.1.106:53956]
14/02/25 18:53:02 INFO Worker: Starting Spark worker 192.168.1.106:53956 with 4 cores, 15.0 GB RAM
14/02/25 18:53:02 INFO Worker: Spark home: /Users/shirish_kumar/Developer/spark-0.9.0-incubating
14/02/25 18:53:02 INFO WorkerWebUI: Started Worker web UI at http://192.168.1.106:8081
14/02/25 18:53:02 INFO Worker: Connecting to master spark://Shirishs-MacBook-Pro.local:7077...
14/02/25 18:53:03 INFO Worker: Successfully registered with master spark://Shirishs-MacBook-Pro.local:7077
现在,当我提交作业时,作业无法执行(因为找不到类错误)但是工作者也死了。这是主日志:
14/02/25 18:55:52 INFO Master: Driver submitted org.apache.spark.deploy.worker.DriverWrapper
14/02/25 18:55:52 INFO Master: Launching driver driver-20140225185552-0000 on worker worker-20140225185302-192.168.1.106-53956
14/02/25 18:55:55 INFO Master: Registering worker Shirishs-MacBook-Pro.local:53956 with 4 cores, 15.0 GB RAM
14/02/25 18:55:55 INFO Master: Attempted to re-register worker at same address: akka.tcp://sparkWorker@192.168.1.106:53956
14/02/25 18:55:55 WARN Master: Got heartbeat from unregistered worker worker-20140225185555-192.168.1.106-53956
14/02/25 18:55:57 INFO Master: akka.tcp://driverClient@192.168.1.106:53961 got disassociated, removing it.
14/02/25 18:55:57 INFO Master: akka.tcp://driverClient@192.168.1.106:53961 got disassociated, removing it.
14/02/25 18:55:57 INFO LocalActorRef: Message [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from Actor[akka://sparkMaster/deadLetters] to Actor[akka://sparkMaster/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkMaster%40192.168.1.106%3A53962-2#-21389169] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
4/02/25 18:55:57 INFO Master: akka.tcp://driverClient@192.168.1.106:53961 got disassociated, removing it.
14/02/25 18:55:57 ERROR EndpointWriter: AssociationError [akka.tcp://sparkMaster@Shirishs-MacBook-Pro.local:7077] -> [akka.tcp://driverClient@192.168.1.106:53961]: Error [Association failed with [akka.tcp://driverClient@192.168.1.106:53961]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://driverClient@192.168.1.106:53961]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: /192.168.1.106:53961
]
...
...
14/02/25 18:55:57 INFO Master: akka.tcp://driverClient@192.168.1.106:53961 got disassociated, removing it.
14/02/25 18:56:03 WARN Master: Got heartbeat from unregistered worker worker-20140225185555-192.168.1.106-53956
14/02/25 18:56:10 WARN Master: Got heartbeat from unregistered worker worker-20140225185555-192.168.1.106-53956
14/02/25 18:56:18 WARN Master: Got heartbeat from unregistered worker worker-20140225185555-192.168.1.106-53956
14/02/25 18:56:25 WARN Master: Got heartbeat from unregistered worker worker-20140225185555-192.168.1.106-53956
14/02/25 18:56:33 WARN Master: Got heartbeat from unregistered worker worker-20140225185555-192.168.1.106-53956
14/02/25 18:56:40 WARN Master: Got heartbeat from unregistered worker worker-20140225185555-192.168.1.106-53956
14/
工作日志看起来像这样
14/02/25 18:55:52 INFO Worker: Asked to launch driver driver-20140225185552-0000
2014-02-25 18:55:52.534 java[11415:330b] Unable to load realm info from SCDynamicStore
14/02/25 18:55:52 INFO DriverRunner: Copying user jar file:/Users/shirish_kumar/Developer/spark_app/SimpleApp to /Users/shirish_kumar/Developer/spark-0.9.0-incubating/work/driver-20140225185552-0000/SimpleApp
14/02/25 18:55:53 INFO DriverRunner: Launch Command: "/Library/Java/JavaVirtualMachines/jdk1.7.0_40.jdk/Contents/Home/bin/java" "-cp" ":/Users/shirish_kumar/Developer/spark-0.9.0-incubating/work/driver-20140225185552-0000/SimpleApp:/Users/shirish_kumar/Developer/spark-0.9.0-incubating/conf:/Users/shirish_kumar/Developer/spark-0.9.0-incubating/assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-hadoop1.0.4.jar" "-Xms512M" "-Xmx512M" "org.apache.spark.deploy.worker.DriverWrapper" "akka.tcp://sparkWorker@192.168.1.106:53956/user/Worker" "SimpleApp"
14/02/25 18:55:55 ERROR OneForOneStrategy: FAILED (of class scala.Enumeration$Val)
scala.MatchError: FAILED (of class scala.Enumeration$Val)
at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:277)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
14/02/25 18:55:55 INFO Worker: Starting Spark worker 192.168.1.106:53956 with 4 cores, 15.0 GB RAM
14/02/25 18:55:55 INFO Worker: Spark home: /Users/shirish_kumar/Developer/spark-0.9.0-incubating
14/02/25 18:55:55 INFO WorkerWebUI: Started Worker web UI at http://192.168.1.106:8081
14/02/25 18:55:55 INFO Worker: Connecting to master spark://Shirishs-MacBook-Pro.local:7077...
14/02/25 18:55:55 INFO Worker: Successfully registered with master spark://Shirishs-MacBook-Pro.local:7077
在webUI之后 - 工作人员显示已经死亡。
我的问题是 - 有没有人遇到过这个问题。如果工作失败,工人不应该死。
答案 0 :(得分:0)
检查你/ Spark /工作文件夹。 您可以看到该特定驱动程序的确切错误。
对我来说,它找不到类的异常。只需为应用程序主类提供完全限定的类名(也包括包名)。
然后清除工作目录并再次以独立模式再次启动应用程序。 这将有效....!
答案 1 :(得分:0)
您必须指定JAR文件的路径。
务实地说,你可以这样做:
sparkConf.set("spark.jars", "file:/myjar1, file:/myjarN")
这意味着你必须先编译一个JAR文件。
您还必须链接相关的JAR - 其中有多种自动化方式,但超出了此问题的范围。