How do I get/create a session against a remote Spark master?

Time: 2018-02-21 20:29:05

Tags: apache-spark pyspark

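I am trying to connect to a Spark master from a remote machine, but I get the error "Java gateway process exited before sending the driver its port number".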

from pyspark.sql import SparkSession
master = "spark://192.168.56.102:7077"
SparkSession.builder.master(master).appName("spark session").getOrCreate()

Spark master

The Spark master is deployed in a CentOS virtual machine (standalone mode).

Edit

telnet

To make the callback socket listen on the VirtualBox network interface, I edited the java_gateway.py file (from the pyspark package), lines 62-70:

    # Start a socket that will be used by PythonGatewayServer to communicate its port to us
    callback_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    # >>>>>>>>>>>>>>>>
    #callback_socket.bind(('127.0.0.1', 0)) # commented
    callback_socket.bind(('192.168.56.1', 0)) # new line

    callback_socket.listen(1)
    callback_host, callback_port = callback_socket.getsockname()
    env = dict(os.environ)
    env['_PYSPARK_DRIVER_CALLBACK_HOST'] = callback_host
    env['_PYSPARK_DRIVER_CALLBACK_PORT'] = str(callback_port)

But I still get the same error when trying to connect: "Java gateway process exited before sending the driver its port number".
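A diagnostic sketch that may help narrow this down (not a confirmed fix): PySpark raises this exception on the client side when the JVM it launches locally through spark-submit exits before reporting its port, so checking the local Java setup separately from network reachability of the master seems worthwhile. The helper names `can_connect` and `local_java_status` are hypothetical, and the host and port are taken from the master URL above.

```python
import os
import shutil
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds (roughly what telnet checks)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def local_java_status():
    """Report whether a JVM is discoverable on this (client) machine."""
    return {
        "java_on_path": shutil.which("java"),      # None if no java executable is on PATH
        "JAVA_HOME": os.environ.get("JAVA_HOME"),  # None if the variable is unset
    }


if __name__ == "__main__":
    # Host and port come from the master URL in the question.
    print("master reachable:", can_connect("192.168.56.102", 7077))
    print("local Java:", local_java_status())
```

If the master is reachable but no local Java is found, the problem is probably on the client machine rather than in the socket binding.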

0 answers:

No answers yet