Python ImportError: cannot import name 'Poller' when running pyspark

Time: 2018-03-20 12:37:59

Tags: python apache-spark pyspark

I followed this tutorial to get started with pyspark on Windows. These are the steps I followed:

  1. Downloaded Spark pre-built for Hadoop 2.7 from here
  2. Extracted spark-2.1.0-bin-hadoop2.7.tgz to the directory that is set as %SPARK_HOME% in the environment variables
  3. Downloaded winutils.exe from here
  4. Pasted winutils.exe into %SPARK_HOME%\bin
  5. Set %HADOOP_HOME% to the same directory as %SPARK_HOME%
  6. Set %PYSPARK_DRIVER_PYTHON% to ipython
  7. Set %PYSPARK_DRIVER_PYTHON_OPTS% to notebook
  8. Added ;%SPARK_HOME%\bin to %PATH%

But when I run

    > pyspark --master local[2]
    

I get the following error:

    [TerminalIPythonApp] WARNING | Subcommand `ipython notebook` is deprecated and will be removed in future versions.
    [TerminalIPythonApp] WARNING | You likely want to use `jupyter notebook` in the future
    Traceback (most recent call last):
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\runpy.py", line 170, in _run_module_as_main
        "__main__", mod_spec)
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "D:\mahesh\Softwares\python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\Scripts\ipython.exe\__main__.py", line 9, in <module>
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\IPython\__init__.py", line 125, in start_ipython
        return launch_new_instance(argv=argv, **kwargs)
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\traitlets\config\application.py", line 657, in launch_instance
        app.initialize(argv)
      File "<decorator-gen-113>", line 2, in initialize
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\traitlets\config\application.py", line 87, in catch_config_error
        return method(app, *args, **kwargs)
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\IPython\terminal\ipapp.py", line 308, in initialize
        super(TerminalIPythonApp, self).initialize(argv)
      File "<decorator-gen-7>", line 2, in initialize
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\traitlets\config\application.py", line 87, in catch_config_error
        return method(app, *args, **kwargs)
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\IPython\core\application.py", line 450, in initialize
        self.parse_command_line(argv)
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\IPython\terminal\ipapp.py", line 303, in parse_command_line
        return super(TerminalIPythonApp, self).parse_command_line(argv)
      File "<decorator-gen-4>", line 2, in parse_command_line
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\traitlets\config\application.py", line 87, in catch_config_error
        return method(app, *args, **kwargs)
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\traitlets\config\application.py", line 514, in parse_command_line
        return self.initialize_subcommand(subc, subargv)
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\IPython\core\application.py", line 243, in initialize_subcommand
        return super(BaseIPythonApplication, self).initialize_subcommand(subc, argv)
      File "<decorator-gen-3>", line 2, in initialize_subcommand
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\traitlets\config\application.py", line 87, in catch_config_error
        return method(app, *args, **kwargs)
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\traitlets\config\application.py", line 445, in initialize_subcommand
        subapp = import_item(subapp)
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\ipython_genutils\importstring.py", line 31, in import_item
        module = __import__(package, fromlist=[obj])
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\notebook\notebookapp.py", line 31, in <module>
        from zmq.eventloop import ioloop
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\zmq\eventloop\__init__.py", line 3, in <module>
        from zmq.eventloop.ioloop import IOLoop
      File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\zmq\eventloop\ioloop.py", line 21, in <module>
        from zmq import (
    ImportError: cannot import name 'Poller'
    
I am able to run the Spark Scala shell correctly using the spark-shell command.

As you can see in the stack trace, I have WinPython installed at this path:

    D:\mahesh\Softwares\python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64
    

So my %PYTHON_HOME% is D:\mahesh\Softwares\python\WinPython-64bit-3.4.4.4Qt5, but my %SPARK_HOME% is D:\mahesh\Programs\spark-2.3.0-bin-hadoop2.7. Running the where pyspark command gives the following output:

    D:\mahesh\Programs\spark-2.3.0-bin-hadoop2.7\bin\pyspark
    D:\mahesh\Programs\spark-2.3.0-bin-hadoop2.7\bin\pyspark.cmd
    D:\mahesh\Softwares\python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\Scripts\pyspark
    D:\mahesh\Softwares\python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\Scripts\pyspark.cmd
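
The output above lists pyspark launchers in two places: the Spark bin directory and the WinPython Scripts directory. Which one actually runs depends on the order of the entries in %PATH%. The same lookup can be reproduced from Python, which makes it easy to repeat for other commands such as ipython. This is only a minimal sketch: find_all_on_path is a hypothetical helper, and the list of extensions it tries is an assumption rather than the full %PATHEXT% logic that Windows uses.

```python
import os


def find_all_on_path(name, path=None, extensions=("", ".cmd", ".bat", ".exe")):
    """List every file called `name` (plus an optional extension) on PATH.

    Hits are returned in PATH order, so the first entry is the one a shell
    would most likely pick up.
    """
    path = os.environ.get("PATH", "") if path is None else path
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        for extension in extensions:
            candidate = os.path.join(directory, name + extension)
            if os.path.isfile(candidate) and candidate not in hits:
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    for hit in find_all_on_path("pyspark"):
        print(hit)
```

If the first hit is not the launcher under %SPARK_HOME%\bin, reordering %PATH% or calling that launcher by its full path removes the ambiguity.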
    

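I believe my problem is some misconfiguration of my Spark environment on Windows, which is why I have given all the information above. So what is going wrong here?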

Note that I performed these steps without using Anaconda and GOW (GNU on Windows), which the tutorial recommends.
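
The setup steps listed earlier can be checked mechanically. The sketch below is not part of the tutorial; it only verifies what those steps describe: that %SPARK_HOME% and %HADOOP_HOME% are set, that winutils.exe sits in the bin directory under %HADOOP_HOME%, and that the Spark bin directory is on %PATH%. check_spark_env is a hypothetical helper name, and the checks assume the layout described in the steps.

```python
import os


def check_spark_env(env=None):
    """Return a list of human-readable problems found in the environment."""
    env = os.environ if env is None else env
    problems = []

    spark_home = env.get("SPARK_HOME")
    hadoop_home = env.get("HADOOP_HOME")
    if not spark_home:
        problems.append("SPARK_HOME is not set")
    if not hadoop_home:
        problems.append("HADOOP_HOME is not set")

    # The steps place winutils.exe in the bin directory under HADOOP_HOME.
    if hadoop_home:
        winutils = os.path.join(hadoop_home, "bin", "winutils.exe")
        if not os.path.isfile(winutils):
            problems.append("winutils.exe not found at " + winutils)

    # The Spark bin directory should appear as one of the PATH entries.
    if spark_home:
        def normalize(p):
            return os.path.normcase(os.path.normpath(p))

        spark_bin = normalize(os.path.join(spark_home, "bin"))
        entries = [normalize(p)
                   for p in env.get("PATH", "").split(os.pathsep) if p]
        if spark_bin not in entries:
            problems.append("PATH does not contain " + spark_bin)

    return problems


if __name__ == "__main__":
    for problem in check_spark_env() or ["no problems found"]:
        print(problem)
```

An empty result only means these variables are consistent with each other. It says nothing about the Python packages that the driver needs.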

1 answer:

Answer 0: (score: 0)

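Point your %PYSPARK_DRIVER_PYTHON% to a virtual environment that contains all the dependencies of 'Poller', then check again. Otherwise you can try installing 'Poller' in the ipython environment (frankly, I don't know how to do that!)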
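
For what it is worth, Poller is not a separate package. It is a class exposed by pyzmq (imported as zmq), and the last frames of the traceback show zmq\eventloop\ioloop.py failing on `from zmq import`. A quick way to see whether the interpreter behind %PYSPARK_DRIVER_PYTHON% has a usable pyzmq is a small check like the one below. This is a minimal sketch: check_attribute is a hypothetical helper, and it assumes you run it with the same Python that ipython uses.

```python
import importlib


def check_attribute(module_name, attribute):
    """Return (ok, detail): does `module_name` import and expose `attribute`?"""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return False, "cannot import {}: {}".format(module_name, exc)

    if not hasattr(module, attribute):
        location = getattr(module, "__file__", "unknown location")
        return False, "{} (loaded from {}) has no attribute {}".format(
            module_name, location, attribute)

    version = getattr(module, "__version__", "unknown version")
    return True, "{} {} provides {}".format(module_name, version, attribute)


if __name__ == "__main__":
    ok, detail = check_attribute("zmq", "Poller")
    print(detail)
```

If the check fails, the pyzmq installation in that environment is probably broken or incomplete, and reinstalling it (for example with `pip install --upgrade --force-reinstall pyzmq`) may help. That is a guess based on the traceback, not a confirmed fix.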