I am trying to follow the example on the Apache Spark documentation site: https://spark.apache.org/docs/2.0.0-preview/submitting-applications.html
I started a standalone Spark cluster and want to run the example Python application. From my spark-2.0.0-bin-hadoop2.7 directory, I run the following command:
./bin/spark-submit \
--master spark://207.184.161.138:7077 \
examples/src/main/python/pi.py \
1000
However, I get this error:
jupyter: '/Users/MyName/spark-2.0.0-bin-hadoop2.7/examples/src/main/python/pi.py' is not a Jupyter command
Here is what my bash_profile looks like:
#setting path for Spark
export SPARK_PATH=~/spark-2.0.0-bin-hadoop2.7
export PYSPARK_DRIVER_PYTHON="jupyter"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
alias snotebook='$SPARK_PATH/bin/pyspark --master local[2]'
What am I doing wrong?
Answer 0 (score: 1):
PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS are used to launch an ipython/jupyter shell when you open the pyspark shell (more info at How to load IPython shell with PySpark). You can set your alias as:
alias snotebook='PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS=notebook $SPARK_PATH/bin/pyspark --master local[2]'
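This works because variable assignments prefixed to a command are scoped to that single invocation and do not persist in the shell. A quick sketch of that behavior (plain shell, nothing Spark-specific):

```shell
# a variable set only for one command is visible inside that command...
PYSPARK_DRIVER_PYTHON=jupyter sh -c 'echo "inside: $PYSPARK_DRIVER_PYTHON"'   # prints "inside: jupyter"
# ...but is not set afterwards in the calling shell
echo "after: ${PYSPARK_DRIVER_PYTHON:-unset}"                                 # prints "after: unset"
```

So with the alias above, jupyter is used only when you run snotebook, and spark-submit sees a clean environment.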
This way it won't interfere with pyspark when you run spark-submit.

Answer 1 (score: 1):
Add PYSPARK_DRIVER_PYTHON=ipython before the spark-submit command.
Example:
PYSPARK_DRIVER_PYTHON=ipython ./bin/spark-submit \
/home/SimpleApp.py
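Alternatively (my suggestion, not part of either answer), you can simply unset the Jupyter driver settings in the current shell session before submitting, so spark-submit falls back to the default Python driver:

```shell
# remove the Jupyter driver settings from the current shell session
unset PYSPARK_DRIVER_PYTHON PYSPARK_DRIVER_PYTHON_OPTS
echo "${PYSPARK_DRIVER_PYTHON:-unset}"   # prints "unset"
# spark-submit would now use the default Python driver, e.g.:
#   ./bin/spark-submit examples/src/main/python/pi.py 1000
```

Note this only affects the current session; to make it permanent, remove the two export lines from bash_profile.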