Problem with spark-submit and related configuration

Date: 2020-01-04 15:11:42

Tags: python apache-spark pyspark spark-submit

I set up spark-env.sh as follows:

SPARK_MASTER_HOST='192.168.1.125'
SPARK_MASTER_PORT=8888
SPARK_MASTER_WEBUI_PORT=9999
SPARK_WORKER_PORT=6666
SPARK_WORKER_WEBUI_PORT=7777

Everything else is left at its default value, unchanged.
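
As a sanity check on this configuration (a probe I sketched myself, not part of the cluster setup), the master RPC port can be verified to be reachable with a short socket test:

import socket

# Sanity-check sketch: confirm the master RPC port from spark-env.sh above
# is reachable before submitting anything. Host and port mirror the config.
with socket.create_connection(("192.168.1.125", 8888), timeout=5) as s:
    print("master port 8888 is reachable from", s.getsockname())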

The SparkContext in my PySpark file is created like this:

from pyspark.context import SparkContext
sc = SparkContext('spark://192.168.1.125:8888', 'PythonPi')
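
For reference, an equivalent way to build the context (a sketch, under the assumption that keeping the master URL out of the script and supplying it via spark-submit --master may be preferable to hard-coding it) is to go through SparkConf:

from pyspark import SparkConf, SparkContext

# Equivalent construction via SparkConf. Dropping the setMaster() call would
# let spark-submit --master supply the URL instead of hard-coding it here.
conf = SparkConf().setAppName('PythonPi').setMaster('spark://192.168.1.125:8888')
sc = SparkContext(conf=conf)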

I started one Spark master and one Spark worker with the following commands:

./start-master.sh
./start-slave.sh spark://192.168.1.125:8888

I can see the master with one alive worker on the localhost:9999 web page, and running

ps -ef | grep -i spark

shows the following:

  502   874     1   0  6:44AM ttys001    0:09.32 /Library/Java/JavaVirtualMachines/jdk1.8.0_221.jdk/Contents/Home/jre/bin/java -cp /usr/local/opt/apache-spark/libexec/conf/:/usr/local/opt/apache-spark/libexec/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host 192.168.1.125 --port 8888 --webui-port 9999
  502  1007     1   0  6:44AM ttys001    0:05.99 /Library/Java/JavaVirtualMachines/jdk1.8.0_221.jdk/Contents/Home/jre/bin/java -cp /usr/local/opt/apache-spark/libexec/conf/:/usr/local/opt/apache-spark/libexec/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 7777 --port 6666 spark://192.168.1.125:8888
  502  1551   454   0  7:03AM ttys001    0:00.00 grep -i spark

I believe this means the simple standalone Spark cluster is set up correctly. However, when I try to submit the PySpark file from the terminal with spark-submit pi.py, the following error appears:

########0 s, before SparkContext
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
20/01/04 06:57:44 INFO SparkContext: Running Spark version 2.2.1
20/01/04 06:57:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/01/04 06:57:44 INFO SparkContext: Submitted application: PythonPi
20/01/04 06:57:44 INFO SecurityManager: Changing view acls to: leslie
20/01/04 06:57:44 INFO SecurityManager: Changing modify acls to: leslie
20/01/04 06:57:44 INFO SecurityManager: Changing view acls groups to: 
20/01/04 06:57:44 INFO SecurityManager: Changing modify acls groups to: 
20/01/04 06:57:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(leslie); groups with view permissions: Set(); users  with modify permissions: Set(leslie); groups with modify permissions: Set()
20/01/04 06:57:45 INFO Utils: Successfully started service 'sparkDriver' on port 50215.
20/01/04 06:57:45 INFO SparkEnv: Registering MapOutputTracker
20/01/04 06:57:45 INFO SparkEnv: Registering BlockManagerMaster
20/01/04 06:57:45 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
20/01/04 06:57:45 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
20/01/04 06:57:45 INFO DiskBlockManager: Created local directory at /private/var/folders/yx/x8d6hdjs5vn8v3c96vjhj5d80000gp/T/blockmgr-77e7cf86-1075-4ab8-999c-fb9170d72ec2
20/01/04 06:57:45 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
20/01/04 06:57:45 INFO SparkEnv: Registering OutputCommitCoordinator
20/01/04 06:57:45 INFO Utils: Successfully started service 'SparkUI' on port 4040.
20/01/04 06:57:45 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.1.125:4040
20/01/04 06:57:45 INFO SparkContext: Added file file:/Users/leslie/workspace/apple-private-cloud/py-spark/test/pi.py at spark://192.168.1.125:50215/files/pi.py with timestamp 1578149865654
20/01/04 06:57:45 INFO Utils: Copying /Users/leslie/workspace/apple-private-cloud/py-spark/test/pi.py to /private/var/folders/yx/x8d6hdjs5vn8v3c96vjhj5d80000gp/T/spark-01df1816-a6f6-4d21-97f8-af20ff6e5ad9/userFiles-2d5fdac7-59fe-40ab-9478-cf0d4918ff72/pi.py
20/01/04 06:57:45 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://192.168.1.125:8888...
20/01/04 06:57:45 INFO TransportClientFactory: Successfully created connection to /192.168.1.125:8888 after 43 ms (0 ms spent in bootstraps)
20/01/04 06:58:05 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://192.168.1.125:8888...
20/01/04 06:58:25 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://192.168.1.125:8888...
20/01/04 06:58:45 ERROR StandaloneSchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
20/01/04 06:58:45 WARN StandaloneSchedulerBackend: Application ID is not initialized yet.
20/01/04 06:58:45 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 50242.
20/01/04 06:58:45 INFO NettyBlockTransferService: Server created on 192.168.1.125:50242
20/01/04 06:58:45 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
20/01/04 06:58:45 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.1.125, 50242, None)
20/01/04 06:58:45 INFO SparkUI: Stopped Spark web UI at http://192.168.1.125:4040
20/01/04 06:58:45 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.1.125:50242 with 366.3 MB RAM, BlockManagerId(driver, 192.168.1.125, 50242, None)
20/01/04 06:58:45 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.1.125, 50242, None)
20/01/04 06:58:45 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.1.125, 50242, None)
20/01/04 06:58:45 INFO StandaloneSchedulerBackend: Shutting down all executors
20/01/04 06:58:45 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down
20/01/04 06:58:45 WARN StandaloneAppClient$ClientEndpoint: Drop UnregisterApplication(null) because has not yet connected to master
20/01/04 06:58:45 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
20/01/04 06:58:45 INFO MemoryStore: MemoryStore cleared
20/01/04 06:58:45 INFO BlockManager: BlockManager stopped
20/01/04 06:58:45 ERROR SparkContext: Error initializing SparkContext.
java.lang.NullPointerException
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:567)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:236)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.GatewayConnection.run(GatewayConnection.java:214)
    at java.lang.Thread.run(Thread.java:748)
20/01/04 06:58:45 INFO SparkContext: SparkContext already stopped.
Traceback (most recent call last):
  File "/Users/leslie/workspace/apple-private-cloud/py-spark/test/pi.py", line 86, in <module>
    sc = SparkContext('spark://192.168.1.125:8888', 'PythonPi')
  File "/usr/local/lib/python3.7/site-packages/pyspark/python/lib/pyspark.zip/pyspark/context.py", line 118, in __init__
  File "/usr/local/lib/python3.7/site-packages/pyspark/python/lib/pyspark.zip/pyspark/context.py", line 180, in _do_init
  File "/usr/local/lib/python3.7/site-packages/pyspark/python/lib/pyspark.zip/pyspark/context.py", line 273, in _initialize_context
  File "/usr/local/lib/python3.7/site-packages/pyspark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1401, in __call__
  File "/usr/local/lib/python3.7/site-packages/pyspark/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
20/01/04 06:58:45 INFO BlockManagerMaster: BlockManagerMaster stopped
: java.lang.NullPointerException
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:567)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:236)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.GatewayConnection.run(GatewayConnection.java:214)
    at java.lang.Thread.run(Thread.java:748)

20/01/04 06:58:45 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
20/01/04 06:58:45 INFO SparkContext: Successfully stopped SparkContext
20/01/04 06:58:45 INFO ShutdownHookManager: Shutdown hook called
20/01/04 06:58:45 INFO ShutdownHookManager: Deleting directory /private/var/folders/yx/x8d6hdjs5vn8v3c96vjhj5d80000gp/T/spark-01df1816-a6f6-4d21-97f8-af20ff6e5ad9

I also tried spark-submit pi.py --master spark://192.168.1.125:8888 and got the same error. Any help would be appreciated.
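
One detail about that last command (an observation on spark-submit's argument parsing, not a confirmed fix for the error above): spark-submit only parses its own options before the application file, and everything after pi.py is forwarded to the script as its arguments, so the --master flag above may simply have been ignored. The equivalent ordering, sketched here via subprocess:

import subprocess

# spark-submit stops parsing its own flags at the application file, so
# --master must come before pi.py; anything placed after the script lands
# in the script's sys.argv instead of being read by spark-submit.
subprocess.run(["spark-submit",
                "--master", "spark://192.168.1.125:8888",  # before the script
                "pi.py"], check=True)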

0 Answers