Problem opening a Jupyter notebook for PySpark

Time: 2019-01-09 23:12:08

Tags: python apache-spark pyspark anaconda jupyter-notebook

I successfully installed Anaconda3, Hadoop (hadoop-2.7.7), and Spark (spark-2.4.0-bin-hadoop2.7).

When I run the pyspark command in a terminal, I get the following output instead of a Jupyter notebook opening in the browser. Below is what I tried.

laptop@laptop-Lenovo:~/spark-2.4.0-bin-hadoop2.7$ pyspark
[I 04:22:08.871 NotebookApp] JupyterLab extension loaded from /home/laptop/anaconda3/lib/python3.7/site-packages/jupyterlab
[I 04:22:08.871 NotebookApp] JupyterLab application directory is /home/laptop/anaconda3/share/jupyter/lab
[I 04:22:08.873 NotebookApp] Serving notebooks from local directory: /home/laptop/spark-2.4.0-bin-hadoop2.7
[I 04:22:08.873 NotebookApp] The Jupyter Notebook is running at:
[I 04:22:08.873 NotebookApp] http://localhost:8888/?token=cb87bf03bfac6184d49ddcb2f3fdbbc2a43ad76c14ed8364
[I 04:22:08.873 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 04:22:08.877 NotebookApp] No web browser found: could not locate runnable browser.
[C 04:22:08.877 NotebookApp] 

    To access the notebook, open this file in a browser:
        file:///run/user/1000/jupyter/nbserver-7600-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=cb87bf03bfac6184d49ddcb2f3fdbbc2a43ad76c14ed8364

If I copy the link above, 'http://localhost:8888/?token=cb87bf03bfac6184d49ddcb2f3fdbbc2a43ad76c14ed8364', paste it into Firefox, and press Enter, I can run Spark commands in the Jupyter notebook.

I have both Chrome and Firefox installed.
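The "No web browser found: could not locate runnable browser" warning in the log carries the message that Python's standard webbrowser module raises when it cannot find a browser to launch. A minimal diagnostic sketch follows, to be run with the same Anaconda Python from the same terminal. The function name describe_browser_lookup and the executable names firefox and google-chrome are assumptions made for illustration, and the sketch only reports what the environment looks like; it does not change anything.

```python
import os
import shutil
import webbrowser


def describe_browser_lookup():
    """Collect what Python's webbrowser module can see in this environment."""
    info = {
        # Environment variables that commonly influence browser discovery.
        "BROWSER": os.environ.get("BROWSER"),
        "DISPLAY": os.environ.get("DISPLAY"),
        # Executable names are assumptions; adjust them to the local install.
        "firefox_on_path": shutil.which("firefox"),
        "chrome_on_path": shutil.which("google-chrome"),
    }
    try:
        # webbrowser.get() with no argument returns the default controller.
        controller = webbrowser.get()
        info["default_controller"] = type(controller).__name__
    except webbrowser.Error as exc:
        # Raised when no runnable browser can be located.
        info["default_controller"] = None
        info["error"] = str(exc)
    return info


if __name__ == "__main__":
    for key, value in describe_browser_lookup().items():
        print(f"{key}: {value}")
```

If default_controller comes back as None even though the browsers are installed, the entries for BROWSER, DISPLAY, and the two PATH lookups may help narrow down why the lookup fails in this shell.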

Below are my .bashrc settings.

laptop@laptop-Lenovo:~$ vi ~/.bashrc

export JAVA_HOME=/usr/lib/jvm/java-8-oracle/jre
export PATH=$PATH:$JAVA_HOME/bin
export PATH=/home/laptop/anaconda3/bin:$PATH

export HADOOP_HOME=/home/laptop/hadoop-2.7.7
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

export SPARK_HOME=/home/laptop/spark-2.4.0-bin-hadoop2.7
export PATH=$PATH:/home/laptop/spark-2.4.0-bin-hadoop2.7/bin

export PYSPARK_PYTHON=/home/laptop/anaconda3/bin/python3
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'

The spark-env.sh script file:

laptop@laptop-Lenovo:~/spark-2.4.0-bin-hadoop2.7$ vi conf/spark-env.sh
export PYSPARK_PYTHON=/home/laptop/anaconda3/bin/python3
export PYSPARK_DRIVER_PYTHON=/home/laptop/anaconda3/bin/jupyter
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'

I tried changing the variable values, but I could not track down the problem.

0 answers:

No answers yet