我在执行以下脚本时遇到了这个问题
./ spark-submit /home/*****/public_html/****/****.py
我首先使用python3.7.2和更高版本的python3.5.2,但仍然收到以下错误消息。
Exception in thread "main" java.io.IOException: Cannot run program "": error=2, No such a file or directory.
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:100)
at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: error=2, No such a file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
at java.lang.ProcessImpl.start(ProcessImpl.java:134)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)... 12 more`
在此之前,我有几个消息输出
2019-02-07 11:30:18 WARN Utils:66 - Your hostname, localhost.localdomain resolves to a loopback address: 127.0.0.1; using xxx.xxx.xxx.xxx instead (on interface eth0)
2019-02-07 11:30:18 WARN Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2019-02-07 11:30:19 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
我能够执行python3 -V 我能够启动spark-shell和pyspark
我很奇怪在“”之间没有显示任何消息。
对于我的python代码,它以
开头import sys
import urllib3
import requests
from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql.types import StructType, StructField
from pyspark.sql.types import DoubleType, IntegerType, StringType
from CommonFunctions import *
from LanguageCodeParser import *
我还尝试了一个非常简单的python代码
print("This is a test.")
执行bash -x spark-submit test.py
+ '[' -z /opt/spark-2.3.2-bin-hadoop2.7 ']'
+ export PYTHONHASHSEED=0
+ PYTHONHASHSEED=0
+ exec /opt/spark-2.3.2-bin-hadoop2.7/bin/spark-class org.apache.spark.deploy.SparkSubmit test.py
但是,它不起作用。谢谢您的帮助。
答案 0 :(得分:0)
我发现设置PYSPARK_PYTHON = / usr / bin / python3很有用
如果可以在
中设置此环境变量,那就太好了/opt/spark-2.3.2-bin-hadoop2.7/conf/spark-env.sh