I am trying to submit a Spark application to a remote Hadoop cluster, following the article here.
This is how I submit the job:
PYSPARK_PYTHON=./PY_ENV/prod_env3/bin/python /home/hadoop/spark-1.6.0-bin-hadoop2.6/bin/spark-submit \
--master yarn \
--name run.py \
--deploy-mode cluster \
--executor-memory 2g \
--executor-cores 1 \
--num-executors 3 \
--jars /home/hadoop/projects/cms_counter/spark-streaming-kafka-assembly_2.10-1.6.0.jar \
--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./PY_ENV/prod_env3/bin/python \
--archives /opt/anaconda/envs/prod_env3.zip#PY_ENV \
/home/hadoop/run.py
Here I can see the environment, the script, and the jars being uploaded to .sparkStaging/application_1490199711887_0131 on HDFS:
17/03/27 11:55:42 INFO ConfiguredRMFailoverProxyProvider: Failing over to rm188
17/03/27 11:55:42 INFO Client: Requesting a new application from cluster with 3 NodeManagers
17/03/27 11:55:42 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (52586 MB per container)
17/03/27 11:55:42 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
17/03/27 11:55:42 INFO Client: Setting up container launch context for our AM
17/03/27 11:55:42 INFO Client: Setting up the launch environment for our AM container
17/03/27 11:55:42 INFO Client: Preparing resources for our AM container
17/03/27 11:55:43 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/03/27 11:55:43 INFO Client: Uploading resource file:/home/hadoop/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar -> hdfs://nameservicehighavail/user/root/.sparkStaging/application_1490199711887_0131/spark-assembly-1.6.0-hadoop2.6.0.jar
17/03/27 11:55:45 INFO Client: Uploading resource file:/home/hadoop/projects/cms_counter/spark-streaming-kafka-assembly_2.10-1.6.0.jar -> hdfs://nameservicehighavail/user/root/.sparkStaging/application_1490199711887_0131/spark-streaming-kafka-assembly_2.10-1.6.0.jar
17/03/27 11:55:45 INFO Client: Uploading resource file:/opt/anaconda/envs/prod_env3.zip#PY_ENV -> hdfs://nameservicehighavail/user/root/.sparkStaging/application_1490199711887_0131/prod_env3.zip
17/03/27 11:55:46 INFO Client: Uploading resource file:/home/hadoop/run.py -> hdfs://nameservicehighavail/user/root/.sparkStaging/application_1490199711887_0131/run.py
17/03/27 11:55:46 INFO Client: Uploading resource file:/home/hadoop/spark-1.6.0-bin-hadoop2.6/python/lib/pyspark.zip -> hdfs://nameservicehighavail/user/root/.sparkStaging/application_1490199711887_0131/pyspark.zip
17/03/27 11:55:46 INFO Client: Uploading resource file:/home/hadoop/spark-1.6.0-bin-hadoop2.6/python/lib/py4j-0.9-src.zip -> hdfs://nameservicehighavail/user/root/.sparkStaging/application_1490199711887_0131/py4j-0.9-src.zip
17/03/27 11:55:46 INFO Client: Uploading resource file:/tmp/spark-7c8130fc-454f-4920-95ce-30211cea3576/__spark_conf__8359653165366110281.zip -> hdfs://nameservicehighavail/user/root/.sparkStaging/application_1490199711887_0131/__spark_conf__8359653165366110281.zip
17/03/27 11:55:46 INFO SecurityManager: Changing view acls to: root
17/03/27 11:55:46 INFO SecurityManager: Changing modify acls to: root
17/03/27 11:55:46 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
17/03/27 11:55:46 INFO Client: Submitting application 131 to ResourceManager
17/03/27 11:55:46 INFO YarnClientImpl: Submitted application application_1490199711887_0131
17/03/27 11:55:47 INFO Client: Application report for application_1490199711887_0131 (state: ACCEPTED)
I can confirm the files exist there:
[root@d83 ~]# hadoop fs -ls .sparkStaging/application_1490199711887_0131
Found 7 items
-rw-r--r-- 3 root supergroup 24766 2017-03-27 11:54 .sparkStaging/application_1490199711887_0131/__spark_conf__8359653165366110281.zip
-rw-r--r-- 3 root supergroup 36034763 2017-03-27 11:54 .sparkStaging/application_1490199711887_0131/prod_env3.zip
-rw-r--r-- 3 root supergroup 44846 2017-03-27 11:54 .sparkStaging/application_1490199711887_0131/py4j-0.9-src.zip
-rw-r--r-- 3 root supergroup 355358 2017-03-27 11:54 .sparkStaging/application_1490199711887_0131/pyspark.zip
-rw-r--r-- 3 root supergroup 2099 2017-03-27 11:54 .sparkStaging/application_1490199711887_0131/run.py
-rw-r--r-- 3 root supergroup 187548272 2017-03-27 11:54 .sparkStaging/application_1490199711887_0131/spark-assembly-1.6.0-hadoop2.6.0.jar
-rw-r--r-- 3 root supergroup 13350134 2017-03-27 11:54 .sparkStaging/application_1490199711887_0131/spark-streaming-kafka-assembly_2.10-1.6.0.jar
But Spark still tells me it cannot find the given Python environment:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/sde/yarn/nm/usercache/root/filecache/1454/spark-assembly-1.6.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/03/27 11:54:10 INFO ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
17/03/27 11:54:11 INFO ApplicationMaster: ApplicationAttemptId: appattempt_1490199711887_0131_000001
17/03/27 11:54:11 INFO SecurityManager: Changing view acls to: yarn,root
17/03/27 11:54:11 INFO SecurityManager: Changing modify acls to: yarn,root
17/03/27 11:54:11 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, root); users with modify permissions: Set(yarn, root)
17/03/27 11:54:12 INFO ApplicationMaster: Starting the user application in a separate Thread
17/03/27 11:54:12 INFO ApplicationMaster: Waiting for spark context initialization
17/03/27 11:54:12 INFO ApplicationMaster: Waiting for spark context initialization ...
17/03/27 11:54:12 ERROR ApplicationMaster: User class threw exception: java.io.IOException: Cannot run program "./PY_ENV/prod_env3/bin/python": error=2, No such file or directory
java.io.IOException: Cannot run program "./PY_ENV/prod_env3/bin/python": error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
at org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:82)
at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
Caused by: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:186)
at java.lang.ProcessImpl.start(ProcessImpl.java:130)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
... 7 more
17/03/27 11:54:12 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.io.IOException: Cannot run program "./PY_ENV/prod_env3/bin/python": error=2, No such file or directory)
17/03/27 11:54:22 ERROR ApplicationMaster: SparkContext did not initialize after waiting for 100000 ms. Please check earlier log output for errors. Failing the application.
17/03/27 11:54:22 INFO ShutdownHookManager: Shutdown hook called
Obviously I must be missing something; any leads would be appreciated.
Answer (score: 0):
It turns out I had zipped the wrong Python environment (`zip -r prod_env3.zip prod_env`). Sorry for the inconvenience.