Running a pyspark MLlib job raises a java_gateway error

Time: 2018-10-22 13:35:14

Tags: apache-spark pyspark apache-spark-mllib py4j

With the code below, I want to use MLlib's FPGrowth through pyspark, but it fails with a Java server error: connection refused. I need help!

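from pyspark.sql import SparkSession
from pyspark.mllib.fpm import FPGrowth

spark = SparkSession.builder.\
    appName('job1').\
    master('local').\
    getOrCreate()
sc = spark.sparkContext

# Each input line is one comma-separated transaction
rdd1 = sc.textFile('/result/201810/*')

rdd2 = rdd1.map(lambda line: line.strip().split(','))
# print rdd2.collect()
# Arguments: data, minSupport=0.4, numPartitions=1
model = FPGrowth.train(rdd2, 0.4, 1)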
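
For reference, the map step above turns each comma-separated line into one transaction, a list of item strings, which is the shape of data passed to FPGrowth.train. Below is a minimal pure-Python sketch of that parsing. The sample lines are made up, since the real contents of /result/201810/* are not shown in the question, and the helper name parse_transaction is hypothetical:

```python
def parse_transaction(line):
    # Same logic as the lambda passed to rdd1.map in the question.
    return line.strip().split(',')

# Hypothetical sample lines; the real input files are not shown.
sample_lines = ["a,b,c\n", "a,b\n", "a\n"]
transactions = [parse_transaction(line) for line in sample_lines]
print(transactions)
```

Note that this split does not trim whitespace around individual items, so a line such as "a, b" would yield the items 'a' and ' b'.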

It then raises the following error:

ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:59364)
Traceback (most recent call last):
  File "/Users/zhaowei/spark/spark-2.2.1-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 963, in start
    self.socket.connect((self.address, self.port))
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 61] Connection refused
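
The traceback shows py4j failing at a plain TCP connect to 127.0.0.1:59364. "[Errno 61] Connection refused" generally means that nothing was listening on that port at that moment. Below is a minimal sketch of the same kind of check using only the standard socket module. The helper name port_is_open is hypothetical and is not part of py4j or pyspark:

```python
import socket

def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return True
    except (socket.error, socket.timeout):
        # Covers "connection refused" as well as timeouts.
        return False
    finally:
        sock.close()

# 59364 is the port from the log above; the gateway port changes on every run.
print(port_is_open("127.0.0.1", 59364))
```

A False result here would only confirm that no process is accepting connections on that port. It does not by itself say why the Java side is not listening.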

1. export CLASSPATH
2. pip install py4j
3. copy py4j.jar /usr/share

0 answers:

No answers