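I am doing block multiplication on the contents of two files. When I finally try to write the result to a text file, I get the following errors: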
Py4JJavaError: An error occurred while calling o426.saveAsTextFile
and
ValueError: could not convert string to float:
The program:
import numpy as np
from pyspark import SparkContext, SparkConf

sc = SparkContext("local", "Simple App")

# Each input line is "row col value", separated by single spaces.
mat = sc.textFile("mat1.txt")
mat2 = sc.textFile("mat2.txt")
matFilter = mat.map(lambda x: [float(i) for i in x.split(" ")])
matFilter2 = mat2.map(lambda x: [float(i) for i in x.split(" ")])

# Group the values of the first matrix by row and of the second by column.
matgroupp = matFilter.map(lambda x: (x[0], [x[2]])).reduceByKey(lambda p,q: p+q)
matgroup2 = matFilter2.map(lambda x: (x[1], [x[2]])).reduceByKey(lambda p,q: p+q)

# Pair every row group with every column group and take the dot product.
matInter = matgroupp.cartesian(matgroup2)
matmul = matInter.map(lambda x: ((x[0][0], x[1][0]), np.dot(x[0][1], x[1][1]))).sortByKey(True)
matmul.saveAsTextFile("results/res.txt")
Contents of mat1.txt:
0 0 10.0
1 0 10.0
Contents of mat2.txt:
0 0 20.0
0 1 10.0
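
One possible cause, offered as a guess rather than a diagnosis: Spark transformations are lazy, so the float() calls only run when saveAsTextFile triggers the job, and a parsing failure would then surface wrapped in the Py4JJavaError. The ValueError with an empty string after the colon is what float("") produces, which can happen if a file has a blank line or a line with a trailing space, because split(" ") keeps empty tokens. The plain-Python sketch below reproduces that behaviour without Spark. The sample lines with a trailing space and a blank line are assumptions for illustration, not taken from the actual files, and parse_line_strict and parse_line_tolerant are hypothetical helper names.

```python
def parse_line_strict(line):
    # Mirrors the lambda in the program: split on a single space.
    # A trailing space or an empty line yields an empty token ''.
    return [float(tok) for tok in line.split(" ")]


def parse_line_tolerant(line):
    # split() with no argument splits on any run of whitespace
    # and never returns empty tokens.
    return [float(tok) for tok in line.split()]


# Assumed sample input: a clean line, a line with a trailing space,
# and a blank line such as one left at the end of a file.
lines = ["0 0 10.0", "1 0 10.0 ", ""]

for line in lines:
    try:
        parse_line_strict(line)
    except ValueError as exc:
        # Report which line the strict parser could not handle.
        print(repr(line), "->", exc)

# Skip blank lines, then parse the rest tolerantly.
parsed = [parse_line_tolerant(line) for line in lines if line.strip()]
print(parsed)
```

If this turns out to be the cause, the same idea in the Spark program would be to drop blank lines with mat.filter(lambda x: x.strip()) before the map, and to use x.split() instead of x.split(" ") inside it.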