I am trying to use MLlib's FPGrowth through PySpark with the code below, but it fails with a Java server error (connection refused). Any help would be appreciated!
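from pyspark.sql import SparkSession
from pyspark.mllib.fpm import FPGrowth

# Start a local Spark session
spark = SparkSession.builder.\
    appName('job1').\
    master('local').\
    getOrCreate()
sc = spark.sparkContext
# Each input line is one comma-separated transaction
rdd1 = sc.textFile('/result/201810/*')
rdd2 = rdd1.map(lambda line: line.strip().split(','))
# print rdd2.collect()
# Arguments: minSupport=0.4, numPartitions=1
model = FPGrowth.train(rdd2, 0.4, 1)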
It then raises the following error:
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:59364)
Traceback (most recent call last):
File "/Users/zhaowei/spark/spark-2.2.1-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 963, in start
self.socket.connect((self.address, self.port))
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 228, in meth
return getattr(self._sock,name)(*args)
error: [Errno 61] Connection refused
I have already tried:
1. export CLASSPATH
2. pip install py4j
3. copy py4j.jar to /usr/share
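
For anyone debugging the same symptom, here is a minimal diagnostic sketch using only the Python standard library. `[Errno 61] Connection refused` means nothing was listening on the port py4j tried to reach, which may indicate that the JVM gateway process had already exited or never started. The helper names `can_connect` and `environment_report` are hypothetical, the list of environment variables is an assumption about what is commonly relevant, and the port 59364 is just the one from the traceback above (py4j picks a new one on every run).

```python
import os
import socket


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return True
    except (socket.error, socket.timeout):
        # Covers "connection refused" as well as timeouts
        return False
    finally:
        sock.close()


def environment_report(names=("JAVA_HOME", "SPARK_HOME", "PYTHONPATH", "CLASSPATH")):
    """Collect environment variables that often matter for a PySpark setup."""
    return {name: os.environ.get(name, "<not set>") for name in names}


if __name__ == "__main__":
    for name, value in sorted(environment_report().items()):
        print("%s = %s" % (name, value))
    # Port taken from the traceback; replace it with the port in your own error
    print("gateway reachable: %s" % can_connect("127.0.0.1", 59364))
```

If the port is unreachable while the Python process is still running, the driver's console output from the JVM side (for example an out-of-memory or startup error) is usually the next place to look.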