在Apache Spark独立模式下提交python示例应用程序时出错

时间:2016-11-26 01:57:50

标签: python apache-spark amazon-ec2

我正在尝试使用所有Amazon EC2 m4.large实例启动并运行Apache Spark群集。 我的设置目前如下:

  • 驱动程序EC2实例
  • Spark Master
  • 两个Spark Slaves

我在驱动程序上安装了Spark 2.0.2,并使用我从https://github.com/amplab/spark-ec2/tree/branch-2.0下载并安装的spark-ec2脚本开发了集群

我可以在浏览器上看到我的Spark Master的Web UI,以及它上面的两个从属节点。

所以现在当我尝试使用建议here的以下命令从我的$ SPARK_HOME提交示例应用程序examples / src / main / python / pi.py时:

./bin/spark-submit --master spark://ip-172-31-43-158.ec2.internal:7077 examples/src/main/python/pi.py 1000

我收到以下错误:

[ec2-user@ip-172-31-48-106 spark]$ ./bin/spark-submit --master spark://ip-172-31-43-158.ec2.internal:7077 examples/src/main/python/pi.py 1000
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/11/26 01:37:59 INFO SparkContext: Running Spark version 2.0.2
16/11/26 01:37:59 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/26 01:37:59 INFO SecurityManager: Changing view acls to: ec2-user
16/11/26 01:37:59 INFO SecurityManager: Changing modify acls to: ec2-user
16/11/26 01:37:59 INFO SecurityManager: Changing view acls groups to:
16/11/26 01:37:59 INFO SecurityManager: Changing modify acls groups to:
16/11/26 01:37:59 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(ec2-user); groups with view permissions: Set(); users  with modify permissions: Set(ec2-user); groups with modify permissions: Set()
16/11/26 01:38:00 INFO Utils: Successfully started service 'sparkDriver' on port 37830.
16/11/26 01:38:00 INFO SparkEnv: Registering MapOutputTracker
16/11/26 01:38:00 INFO SparkEnv: Registering BlockManagerMaster
16/11/26 01:38:00 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-f4b273d5-1d43-4e19-859a-3fb88fe527ff
16/11/26 01:38:00 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
16/11/26 01:38:00 INFO SparkEnv: Registering OutputCommitCoordinator
16/11/26 01:38:00 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/11/26 01:38:00 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://172.31.48.106:4040
16/11/26 01:38:00 INFO SparkContext: Added file file:/opt/spark-2.0.2-bin-hadoop2.7/examples/src/main/python/pi.py at spark://172.31.48.106:37830/files/pi.py with timestamp 1480124280871
16/11/26 01:38:00 INFO Utils: Copying /opt/spark-2.0.2-bin-hadoop2.7/examples/src/main/python/pi.py to /tmp/spark-e16c0211-227d-45bb-866c-b38c84bd3344/userFiles-5e86cb15-a858-4421-9005-0d04364d2c68/pi.py
16/11/26 01:38:00 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://ip-172-31-43-158.ec2.internal:7077...
16/11/26 01:38:20 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://ip-172-31-43-158.ec2.internal:7077...
16/11/26 01:38:40 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://ip-172-31-43-158.ec2.internal:7077...
16/11/26 01:39:00 ERROR StandaloneSchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.

16/11/26 01:39:00 WARN StandaloneSchedulerBackend: Application ID is not initialized yet.
16/11/26 01:39:01 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 44219.
16/11/26 01:39:01 INFO NettyBlockTransferService: Server created on 172.31.48.106:44219
16/11/26 01:39:01 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 172.31.48.106, 44219)
16/11/26 01:39:01 INFO BlockManagerMasterEndpoint: Registering block manager 172.31.48.106:44219 with 366.3 MB RAM, BlockManagerId(driver, 172.31.48.106, 44219)
16/11/26 01:39:01 INFO SparkUI: Stopped Spark web UI at http://172.31.48.106:4040
16/11/26 01:39:01 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 172.31.48.106, 44219)
16/11/26 01:39:01 INFO StandaloneSchedulerBackend: Shutting down all executors
16/11/26 01:39:01 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down
16/11/26 01:39:01 WARN StandaloneAppClient$ClientEndpoint: Drop UnregisterApplication(null) because has not yet connected to master
16/11/26 01:39:01 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/11/26 01:39:01 ERROR MapOutputTrackerMaster: Error communicating with MapOutputTracker
java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1038)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
        at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:208)
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:190)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:81)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78)
        at org.apache.spark.MapOutputTracker.askTracker(MapOutputTracker.scala:100)
        at org.apache.spark.MapOutputTracker.sendTracker(MapOutputTracker.scala:110)
        at org.apache.spark.MapOutputTrackerMaster.stop(MapOutputTracker.scala:580)
        at org.apache.spark.SparkEnv.stop(SparkEnv.scala:84)
        at org.apache.spark.SparkContext$$anonfun$stop$11.apply$mcV$sp(SparkContext.scala:1797)
        at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1290)
        at org.apache.spark.SparkContext.stop(SparkContext.scala:1796)
        at org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend.dead(StandaloneSchedulerBackend.scala:142)
        at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint.markDead(StandaloneAppClient.scala:254)
        at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anon$2.run(StandaloneAppClient.scala:131)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
16/11/26 01:39:01 ERROR Utils: Uncaught exception in thread appclient-registration-retry-thread
org.apache.spark.SparkException: Error communicating with MapOutputTracker
        at org.apache.spark.MapOutputTracker.askTracker(MapOutputTracker.scala:104)
        at org.apache.spark.MapOutputTracker.sendTracker(MapOutputTracker.scala:110)
        at org.apache.spark.MapOutputTrackerMaster.stop(MapOutputTracker.scala:580)
        at org.apache.spark.SparkEnv.stop(SparkEnv.scala:84)
        at org.apache.spark.SparkContext$$anonfun$stop$11.apply$mcV$sp(SparkContext.scala:1797)
        at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1290)
        at org.apache.spark.SparkContext.stop(SparkContext.scala:1796)
        at org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend.dead(StandaloneSchedulerBackend.scala:142)
        at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint.markDead(StandaloneAppClient.scala:254)
        at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anon$2.run(StandaloneAppClient.scala:131)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1038)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
        at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:208)
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:190)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:81)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78)
        at org.apache.spark.MapOutputTracker.askTracker(MapOutputTracker.scala:100)
        ... 16 more
16/11/26 01:39:01 INFO SparkContext: Successfully stopped SparkContext
16/11/26 01:39:01 ERROR SparkContext: Error initializing SparkContext.
java.lang.NullPointerException
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:546)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:240)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:236)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:745)
16/11/26 01:39:01 INFO SparkContext: SparkContext already stopped.
Traceback (most recent call last):
  File "/opt/spark-2.0.2-bin-hadoop2.7/examples/src/main/python/pi.py", line 32, in <module>
    .appName("PythonPi")\
  File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 169, in getOrCreate
  File "/opt/spark/python/lib/pyspark.zip/pyspark/context.py", line 294, in getOrCreate
  File "/opt/spark/python/lib/pyspark.zip/pyspark/context.py", line 115, in __init__
  File "/opt/spark/python/lib/pyspark.zip/pyspark/context.py", line 168, in _do_init
  File "/opt/spark/python/lib/pyspark.zip/pyspark/context.py", line 233, in _initialize_context
  File "/opt/spark/python/lib/py4j-0.10.3-src.zip/py4j/java_gateway.py", line 1401, in __call__
  File "/opt/spark/python/lib/py4j-0.10.3-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NullPointerException
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:546)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:240)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:236)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:745)

16/11/26 01:39:01 INFO DiskBlockManager: Shutdown hook called
16/11/26 01:39:01 INFO ShutdownHookManager: Shutdown hook called
16/11/26 01:39:01 INFO ShutdownHookManager: Deleting directory /tmp/spark-e16c0211-227d-45bb-866c-b38c84bd3344/userFiles-5e86cb15-a858-4421-9005-0d04364d2c68
16/11/26 01:39:01 INFO ShutdownHookManager: Deleting directory /tmp/spark-e16c0211-227d-45bb-866c-b38c84bd3344

This是我的Spark Master的Web UI的截图。

我出错的任何指示? 我真的很感激你的一些帮助!

1 个答案:

答案 0 :(得分:0)

好的,所以我傻了。我试图从我的驱动程序本身运行spark-submit。我原本应该通过SSH登录Master(按照建议here并在下面提到),然后运行命令。

  • 进入您下载的Spark版本中的ec2目录
  • 运行./spark-ec2 -k <keypair> -i <key-file> login <cluster-name>以SSH进入群集
  • 然后运行问题
  • 中提到的spark-submit命令