I have seen the same question asked before, but the answer there does not solve my problem: I get an error when running the following PySpark program.
OS: Windows 10
PySpark version: 2.3.1
Java JDK: jdk-11.0.1
Java version: 1.8.0_181

Code:
from pyspark.context import SparkContext
sc = SparkContext('local', 'test')
logFile = 'file:///F:/my_spark/README.md'
logData = sc.textFile(logFile, 2).cache()
print('logData:', logData)
numAs = logData.filter(lambda line: 'a' in line).count()
numBs = logData.filter(lambda line: 'b' in line).count()
print('Lines with a: %s, Lines with b: %s' % (numAs, numBs))
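
(Side note: since I have two Java versions listed above, here is a minimal sketch of how the JVM version that PySpark actually launched can be checked. It relies on sc._jvm, Py4J's internal gateway handle into the driver JVM, so it is for diagnosis only, not public API:)

from pyspark.context import SparkContext

sc = SparkContext('local', 'version-check')
# 'java.version' is a standard JVM system property; this asks the
# driver JVM that Py4J started which Java it is running on.
print(sc._jvm.java.lang.System.getProperty('java.version'))
sc.stop()

This prints which of the two installed JVMs the driver process is actually using.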
Error:
Traceback (most recent call last):
File "F:/my_spark/testsecond.py", line 14, in <module>
numAs = logData.filter(lambda line: 'a' in line).count()
File "C:\Users\17610\Anaconda3\lib\site-packages\pyspark\rdd.py", line 1053, in count
return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum()
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: java.lang.IllegalArgumentException
at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:46)
at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:449)
at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:432)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:103)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:103)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:103)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)