What is the correct setup for remotely connecting to a Spark cluster on AWS?

Asked: 2016-09-04 08:27:04

Tags: amazon-web-services apache-spark

I have a Spark cluster on AWS with all ports open, and a driver program running locally on my laptop. I get the trace below and don't know how to resolve it. I believe the driver can connect to the master because of the following settings:

export SPARK_PUBLIC_DNS="52.44.36.224"
export SPARK_WORKER_CORES=6

However, the driver running locally on my laptop probably cannot connect directly to the workers/executors, because the Spark cluster on AWS reports their private IPs, which are not reachable from outside AWS. How do I fix this?
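For context, the driver side presumably looks roughly like the sketch below. Only the class name (Consumer), the master URL, and the driver's local IP are taken from the trace; everything else is illustrative and hypothetical, not my exact code:

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class Consumer {
    public static void main(String[] args) throws InterruptedException {
        // Connect from the laptop to the standalone master by its public IP,
        // matching "Connecting to master spark://52.44.36.224:7077" in the trace.
        SparkConf conf = new SparkConf()
                .setAppName("Consumer")
                .setMaster("spark://52.44.36.224:7077")
                // The driver advertises this address to the cluster; the executors
                // on AWS would need to reach it back, which is the suspected problem.
                .set("spark.driver.host", "192.168.0.191");

        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(1));

        // ... stream definitions omitted ...

        jssc.start();            // corresponds to "start at Consumer.java:41" in the trace
        jssc.awaitTermination();
    }
}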

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/09/04 01:22:17 INFO SparkContext: Running Spark version 2.0.0
16/09/04 01:22:17 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/09/04 01:22:18 INFO SecurityManager: Changing view acls to: kantkodali
16/09/04 01:22:18 INFO SecurityManager: Changing modify acls to: kantkodali
16/09/04 01:22:18 INFO SecurityManager: Changing view acls groups to: 
16/09/04 01:22:18 INFO SecurityManager: Changing modify acls groups to: 
16/09/04 01:22:18 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(kantkodali); groups with view permissions: Set(); users  with modify permissions: Set(kantkodali); groups with modify permissions: Set()
16/09/04 01:22:18 INFO Utils: Successfully started service 'sparkDriver' on port 55091.
16/09/04 01:22:18 INFO SparkEnv: Registering MapOutputTracker
16/09/04 01:22:18 INFO SparkEnv: Registering BlockManagerMaster
16/09/04 01:22:18 INFO DiskBlockManager: Created local directory at /private/var/folders/_6/lfxt933j3bd_xhq0m7dwm8s00000gn/T/blockmgr-cc8a4985-f9c0-4c1a-b17f-876e146cbd87
16/09/04 01:22:18 INFO MemoryStore: MemoryStore started with capacity 2004.6 MB
16/09/04 01:22:18 INFO SparkEnv: Registering OutputCommitCoordinator
16/09/04 01:22:19 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/09/04 01:22:19 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.0.191:4040
16/09/04 01:22:19 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://52.44.36.224:7077...
16/09/04 01:22:19 INFO TransportClientFactory: Successfully created connection to /52.44.36.224:7077 after 89 ms (0 ms spent in bootstraps)
16/09/04 01:22:19 INFO StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20160904082232-0001
16/09/04 01:22:19 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20160904082232-0001/0 on worker-20160904003146-172.31.3.246-40675 (172.31.3.246:40675) with 2 cores
16/09/04 01:22:19 INFO StandaloneSchedulerBackend: Granted executor ID app-20160904082232-0001/0 on hostPort 172.31.3.246:40675 with 2 cores, 1024.0 MB RAM
16/09/04 01:22:19 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20160904082232-0001/1 on worker-20160904003205-172.31.3.245-35631 (172.31.3.245:35631) with 2 cores
16/09/04 01:22:19 INFO StandaloneSchedulerBackend: Granted executor ID app-20160904082232-0001/1 on hostPort 172.31.3.245:35631 with 2 cores, 1024.0 MB RAM
16/09/04 01:22:19 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55093.
16/09/04 01:22:19 INFO NettyBlockTransferService: Server created on 192.168.0.191:55093
16/09/04 01:22:19 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.0.191, 55093)
16/09/04 01:22:19 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.0.191:55093 with 2004.6 MB RAM, BlockManagerId(driver, 192.168.0.191, 55093)
16/09/04 01:22:19 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20160904082232-0001/0 is now RUNNING
16/09/04 01:22:19 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.0.191, 55093)
16/09/04 01:22:19 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20160904082232-0001/1 is now RUNNING
16/09/04 01:22:19 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
16/09/04 01:22:20 INFO SparkContext: Starting job: start at Consumer.java:41
16/09/04 01:22:20 INFO DAGScheduler: Registering RDD 1 (start at Consumer.java:41)
16/09/04 01:22:20 INFO DAGScheduler: Got job 0 (start at Consumer.java:41) with 20 output partitions
16/09/04 01:22:20 INFO DAGScheduler: Final stage: ResultStage 1 (start at Consumer.java:41)
16/09/04 01:22:20 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
16/09/04 01:22:20 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0)
16/09/04 01:22:20 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[1] at start at Consumer.java:41), which has no missing parents
16/09/04 01:22:20 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.1 KB, free 2004.6 MB)
16/09/04 01:22:20 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2001.0 B, free 2004.6 MB)
16/09/04 01:22:20 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.0.191:55093 (size: 2001.0 B, free: 2004.6 MB)
16/09/04 01:22:20 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1012
16/09/04 01:22:20 INFO DAGScheduler: Submitting 50 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[1] at start at Consumer.java:41)
16/09/04 01:22:20 INFO TaskSchedulerImpl: Adding task set 0.0 with 50 tasks
16/09/04 01:22:35 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:22:50 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:23:05 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:23:20 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:23:35 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:23:50 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:24:05 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:24:20 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:24:20 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20160904082232-0001/1 is now EXITED (Command exited with code 1)
16/09/04 01:24:20 INFO StandaloneSchedulerBackend: Executor app-20160904082232-0001/1 removed: Command exited with code 1
16/09/04 01:24:20 INFO BlockManagerMaster: Removal of executor 1 requested
16/09/04 01:24:20 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove non-existent executor 1
16/09/04 01:24:20 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20160904082232-0001/2 on worker-20160904003205-172.31.3.245-35631 (172.31.3.245:35631) with 1 cores
16/09/04 01:24:20 INFO StandaloneSchedulerBackend: Granted executor ID app-20160904082232-0001/2 on hostPort 172.31.3.245:35631 with 1 cores, 1024.0 MB RAM
16/09/04 01:24:20 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20160904082232-0001/3 on worker-20160904003146-172.31.3.246-40675 (172.31.3.246:40675) with 1 cores
16/09/04 01:24:20 INFO StandaloneSchedulerBackend: Granted executor ID app-20160904082232-0001/3 on hostPort 172.31.3.246:40675 with 1 cores, 1024.0 MB RAM

0 answers:

No answers yet