Apache Spark cannot connect to the master when using the spark-submit script on Amazon EC2

Time: 2016-08-02 12:43:35

Tags: apache-spark amazon-ec2

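First, I used the spark-ec2 script to set up a Spark cluster on EC2 with one master and one worker node.

After connecting to my EC2 master instance over SSH, I want to run the spark-submit script so that I can run my own Spark code. I first upload my .jar file and then run the script.

To do this I use the following command: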

sudo /root/spark/bin/spark-submit --class "SimpleApp" \
--master spark://ec2-<address>.us-west-1.compute.amazonaws.com:7077 simple-project-1.0.jar

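Sadly, this does not work because the script cannot connect to the master (the full error output is at the end):

java.io.IOException: Failed to connect to ec2-<address>.us-west-1.compute.amazonaws.com/<private-IP>:7077

I manually added an inbound rule to my security group to allow access on port 7077 and still get the same error. Is there something I need to do between setting up the cluster and launching the application?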

[ec2-user@ip-172-31-11-100 ~]$ sudo /root/spark/bin/spark-submit --class "SimpleApp" --master spark://<ec2-address>.us-west-1.compute.amazonaws.com:7077 simple-project-1.0.jar 
16/08/02 12:18:43 INFO spark.SparkContext: Running Spark version 1.6.1
16/08/02 12:18:44 WARN spark.SparkConf: 
SPARK_WORKER_INSTANCES was detected (set to '1').
This is deprecated in Spark 1.0+.

Please instead use:
 - ./spark-submit with --num-executors to specify the number of executors
 - Or set SPARK_EXECUTOR_INSTANCES
 - spark.executor.instances to configure the number of instances in the spark config.

16/08/02 12:18:44 INFO spark.SecurityManager: Changing view acls to: root
16/08/02 12:18:44 INFO spark.SecurityManager: Changing modify acls to: root
16/08/02 12:18:44 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/08/02 12:18:45 INFO util.Utils: Successfully started service 'sparkDriver' on port 58516.
16/08/02 12:18:45 INFO slf4j.Slf4jLogger: Slf4jLogger started
16/08/02 12:18:45 INFO Remoting: Starting remoting
16/08/02 12:18:46 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@172.31.11.100:45559]
16/08/02 12:18:46 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 45559.
16/08/02 12:18:46 INFO spark.SparkEnv: Registering MapOutputTracker
16/08/02 12:18:46 INFO spark.SparkEnv: Registering BlockManagerMaster
16/08/02 12:18:46 INFO storage.DiskBlockManager: Created local directory at /mnt/spark/blockmgr-83f1cf8d-3783-4659-a0da-64ae7c95e850
16/08/02 12:18:46 INFO storage.DiskBlockManager: Created local directory at /mnt2/spark/blockmgr-9a22a761-a18f-45a4-9d49-dcfaf7f9e4f8
16/08/02 12:18:46 INFO storage.MemoryStore: MemoryStore started with capacity 511.5 MB
16/08/02 12:18:46 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/08/02 12:18:46 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/08/02 12:18:46 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
16/08/02 12:18:46 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
16/08/02 12:18:46 INFO ui.SparkUI: Started SparkUI at http://ec2-54-153-24-33.us-west-1.compute.amazonaws.com:4040
16/08/02 12:18:46 INFO spark.HttpFileServer: HTTP File server directory is /mnt/spark/spark-12fdcf09-fcfc-4bf6-98d3-ec1f27d21345/httpd-da6f3d59-bc33-4a06-bac9-cb0c27fd82d9
16/08/02 12:18:46 INFO spark.HttpServer: Starting HTTP Server
16/08/02 12:18:46 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/08/02 12:18:47 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:59371
16/08/02 12:18:47 INFO util.Utils: Successfully started service 'HTTP file server' on port 59371.
16/08/02 12:18:47 INFO spark.SparkContext: Added JAR file:/home/ec2-user/simple-project-1.0.jar at http://172.31.11.100:59371/jars/simple-project-1.0.jar with timestamp 1470140327032
16/08/02 12:18:47 INFO client.AppClient$ClientEndpoint: Connecting to master spark://ec2-54-183-242-177.us-west-1.compute.amazonaws.com:7077...
16/08/02 12:18:47 WARN client.AppClient$ClientEndpoint: Failed to connect to master ec2-54-183-242-177.us-west-1.compute.amazonaws.com:7077
java.io.IOException: Failed to connect to ec2-54-183-242-177.us-west-1.compute.amazonaws.com/172.31.11.100:7077
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:200)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:183)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Verbindungsaufbau abgelehnt: ec2-54-183-242-177.us-west-1.compute.amazonaws.com/172.31.11.100:7077
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    ... 1 more
16/08/02 12:19:07 INFO client.AppClient$ClientEndpoint: Connecting to master spark://ec2-54-183-242-177.us-west-1.compute.amazonaws.com:7077...
16/08/02 12:19:07 WARN client.AppClient$ClientEndpoint: Failed to connect to master ec2-54-183-242-177.us-west-1.compute.amazonaws.com:7077
java.io.IOException: Failed to connect to ec2-54-183-242-177.us-west-1.compute.amazonaws.com/172.31.11.100:7077
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:200)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:183)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Verbindungsaufbau abgelehnt: ec2-54-183-242-177.us-west-1.compute.amazonaws.com/172.31.11.100:7077
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    ... 1 more
16/08/02 12:19:27 INFO client.AppClient$ClientEndpoint: Connecting to master spark://ec2-54-183-242-177.us-west-1.compute.amazonaws.com:7077...
16/08/02 12:19:27 INFO client.AppClient$ClientEndpoint: Connecting to master spark://ec2-54-183-242-177.us-west-1.compute.amazonaws.com:7077...
16/08/02 12:19:27 WARN client.AppClient$ClientEndpoint: Failed to connect to master ec2-54-183-242-177.us-west-1.compute.amazonaws.com:7077
java.io.IOException: Failed to connect to ec2-54-183-242-177.us-west-1.compute.amazonaws.com/172.31.11.100:7077
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:200)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:183)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Verbindungsaufbau abgelehnt: ec2-54-183-242-177.us-west-1.compute.amazonaws.com/172.31.11.100:7077
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    ... 1 more
16/08/02 12:19:47 INFO client.AppClient$ClientEndpoint: Connecting to master spark://ec2-54-183-242-177.us-west-1.compute.amazonaws.com:7077...
16/08/02 12:19:47 ERROR cluster.SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
16/08/02 12:19:47 INFO client.AppClient$ClientEndpoint: Connecting to master spark://ec2-54-183-242-177.us-west-1.compute.amazonaws.com:7077...
16/08/02 12:19:47 WARN cluster.SparkDeploySchedulerBackend: Application ID is not initialized yet.
16/08/02 12:19:47 WARN client.AppClient$ClientEndpoint: Failed to connect to master ec2-54-183-242-177.us-west-1.compute.amazonaws.com:7077
java.io.IOException: Failed to connect to ec2-54-183-242-177.us-west-1.compute.amazonaws.com/172.31.11.100:7077
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:200)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:183)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Verbindungsaufbau abgelehnt: ec2-54-183-242-177.us-west-1.compute.amazonaws.com/172.31.11.100:7077
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    ... 1 more
16/08/02 12:19:47 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 52691.
16/08/02 12:19:47 INFO netty.NettyBlockTransferService: Server created on 52691
16/08/02 12:19:47 INFO storage.BlockManagerMaster: Trying to register BlockManager
16/08/02 12:19:47 INFO storage.BlockManagerMasterEndpoint: Registering block manager 172.31.11.100:52691 with 511.5 MB RAM, BlockManagerId(driver, 172.31.11.100, 52691)
16/08/02 12:19:47 INFO storage.BlockManagerMaster: Registered BlockManager
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
16/08/02 12:19:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
16/08/02 12:19:47 INFO ui.SparkUI: Stopped Spark web UI at http://ec2-54-153-24-33.us-west-1.compute.amazonaws.com:4040
16/08/02 12:19:47 INFO cluster.SparkDeploySchedulerBackend: Shutting down all executors
16/08/02 12:19:47 INFO cluster.SparkDeploySchedulerBackend: Asking each executor to shut down
16/08/02 12:19:47 WARN client.AppClient$ClientEndpoint: Drop UnregisterApplication(null) because has not yet connected to master
16/08/02 12:19:47 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[appclient-registration-retry-thread,5,main]
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1038)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
    at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:208)
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:107)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
    at org.apache.spark.deploy.client.AppClient.stop(AppClient.scala:290)
    at org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend.org$apache$spark$scheduler$cluster$SparkDeploySchedulerBackend$$stop(SparkDeploySchedulerBackend.scala:198)
    at org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend.stop(SparkDeploySchedulerBackend.scala:101)
    at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:446)
    at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1582)
    at org.apache.spark.SparkContext$$anonfun$stop$9.apply$mcV$sp(SparkContext.scala:1740)
    at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1229)
    at org.apache.spark.SparkContext.stop(SparkContext.scala:1739)
    at org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend.dead(SparkDeploySchedulerBackend.scala:127)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint.markDead(AppClient.scala:264)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2$$anonfun$run$1.apply$mcV$sp(AppClient.scala:134)
    at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1163)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2.run(AppClient.scala:129)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
16/08/02 12:19:47 INFO storage.DiskBlockManager: Shutdown hook called
16/08/02 12:19:47 INFO util.ShutdownHookManager: Shutdown hook called
16/08/02 12:19:47 INFO util.ShutdownHookManager: Deleting directory /mnt/spark/spark-12fdcf09-fcfc-4bf6-98d3-ec1f27d21345/userFiles-7ddf41a5-7328-4bdd-afcd-a4610404ecac
16/08/02 12:19:47 INFO util.ShutdownHookManager: Deleting directory /mnt2/spark/spark-5991f32e-20ef-4433-8de7-44ad57c53d97
16/08/02 12:19:47 INFO util.ShutdownHookManager: Deleting directory /mnt/spark/spark-12fdcf09-fcfc-4bf6-98d3-ec1f27d21345
16/08/02 12:19:47 INFO util.ShutdownHookManager: Deleting directory /mnt/spark/spark-12fdcf09-fcfc-4bf6-98d3-ec1f27d21345/httpd-da6f3d59-bc33-4a06-bac9-cb0c27fd82d9
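
The "Verbindungsaufbau abgelehnt" lines in the log are the German-locale form of "Connection refused". That usually suggests the host was reached but nothing accepted the connection on that address and port, whereas a blocking security-group rule more commonly shows up as a timeout. A minimal sketch for telling the two cases apart with a plain TCP probe follows; the function name `probe_port` is hypothetical and not part of Spark or spark-ec2.

```python
import socket


def probe_port(host, port, timeout=3.0):
    """Attempt a TCP connection and classify the outcome as a short string."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but no process is listening on this address/port.
        return "refused"
    except socket.timeout:
        # No answer at all, which is typical when packets are silently dropped.
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar problems.
        return "error: %s" % exc
```

Run on the master instance, `probe_port("172.31.11.100", 7077)` returning "refused" would hint that the master process is not running or is not listening on that address, which may be worth checking before changing security-group rules any further.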

1 Answer:

Answer 0 (score: -1)

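If you are not using YARN or Mesos as the cluster manager, that is, you are in standalone mode, you have to deploy your application with spark-submit to each of your clusters, one by one.

Deploying the application locally (local[n]) on each cluster over SSH is fine, assuming you configured the master and slaves correctly when you created the standalone cluster.

To answer your second question: the local directive only lets you set how many threads the application should run with on each cluster, where n is the number of threads. So it has nothing to do with whether the application runs on one cluster or on several.

So if you use spark-submit locally, over SSH, to deploy the application to every cluster (master and slaves), and your standalone setup is correct, the application should run on all clusters.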
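
To make the difference between a local[n] master and a spark:// master URL concrete, here is a small illustrative sketch that classifies the string passed to --master. It is not Spark's own parser: the function name `describe_master` is hypothetical, and it only covers the `local`, `local[n]`, `local[*]`, and `spark://host:port` forms.

```python
import re


def describe_master(master):
    """Classify a --master string (illustrative only, not Spark's parser)."""
    if master == "local":
        # Plain "local" runs everything in one JVM with a single worker thread.
        return ("local", 1)
    m = re.fullmatch(r"local\[(\d+|\*)\]", master)
    if m:
        n = m.group(1)
        # local[n] uses n threads; local[*] uses as many as there are cores.
        return ("local", "all cores" if n == "*" else int(n))
    m = re.fullmatch(r"spark://([^:/]+):(\d+)", master)
    if m:
        # Standalone mode: the driver registers with the master at host:port.
        return ("standalone", (m.group(1), int(m.group(2))))
    # Anything else (yarn, mesos://..., and so on) is left unclassified here.
    return ("other", master)
```

For instance, `describe_master("local[4]")` gives `("local", 4)`, a thread count inside a single process, while a `spark://` URL such as the one in the question is classified as standalone with a host and port to connect to.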