Hi everyone, I'm using PySpark (Python). I've included my code below and have run into a problem. Does anyone know what causes it?
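import csv
import io
# Open the CSV as UTF-8 and replace any byte-order mark (U+FEFF) with a space before parsing.
with io.open('Workbook2.csv', 'r', encoding='utf8') as infile:
    ipFile = csv.DictReader((x.replace(u"\uFEFF", u" ") for x in infile))
    ....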
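As a side note on the BOM handling above: replacing U+FEFF with a space leaves a leading space in the first header name. A minimal sketch of an alternative, assuming the file is UTF-8, is to let Python's built-in `utf-8-sig` codec skip the BOM. The helper name `read_csv_rows` is hypothetical and not part of the original code.

```python
import csv
import io


def read_csv_rows(path):
    """Read a CSV into a list of dicts, skipping a leading UTF-8 BOM if present."""
    # 'utf-8-sig' decodes UTF-8 and drops a BOM at the start of the file;
    # newline='' is what the csv module recommends for files it reads.
    with io.open(path, 'r', encoding='utf-8-sig', newline='') as infile:
        return list(csv.DictReader(infile))
```

The rows are materialized inside the `with` block because `csv.DictReader` reads lazily and the file is closed once the block exits.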
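This is a piece of my code that returns a boolean value (true/false). It worked fine the first time I ran it, but after restarting the kernel, this is where I get the error.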
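from pyspark.sql import Window
from pyspark.sql.functions import lag
# Window over rows that share the same id, ordered by id.
windowSpec = Window.partitionBy(df_Broadcast['id']).orderBy(df_Broadcast['id'])
windowSpec
# Previous row's id within the window, compared against the current id.
IdShift = lag(df_Broadcast["id"]).over(windowSpec).alias('IdShift')
df_Broadcast = df_Broadcast.withColumn('CheckId', df_Broadcast['id'] != IdShift)
df_Broadcast.show()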
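To make the window logic above easier to reason about, here is a plain-Python emulation of what `lag` computes inside each partition. It is only a sketch, not Spark itself: `lag_within_partitions` is a hypothetical helper, `None` stands in for Spark's null, and the ids are assumed to be non-null. It relies on two facts about Spark: `lag` returns null for the first row of a partition, and a comparison with null evaluates to null.

```python
from itertools import groupby


def lag_within_partitions(rows, key):
    """Emulate lag(key) over a window partitioned and ordered by the same key."""
    out = []
    ordered = sorted(rows, key=lambda r: r[key])
    for _, group in groupby(ordered, key=lambda r: r[key]):
        prev = None  # the first row of each partition has no previous row
        for row in group:
            # A comparison against null yields null in Spark; None models that here.
            check = None if prev is None else row[key] != prev
            out.append({**row, "IdShift": prev, "CheckId": check})
            prev = row[key]
    return out
```

Because every row in a partition shares the same `id`, this emulation never produces `True` for `CheckId`: it is null on the first row of each partition and `False` afterwards. If the goal is to compare against the previous row across the whole dataset, the window would probably need a different partitioning.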
Answer 0 (score: 8)
The error is:
Caused by: java.lang.OutOfMemoryError: Java heap space
You need more memory to perform the operation and avoid the OOM error.
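One way to give Spark more memory is through the `spark.driver.memory` and `spark.executor.memory` settings. The sketch below assumes a session created from Python; the `4g` values, the app name, and the helper names `memory_conf` and `build_session` are illustrative assumptions, not values from the question.

```python
def memory_conf(driver_memory="4g", executor_memory="4g"):
    """Return Spark memory settings as a plain dict (sizes are placeholders)."""
    return {
        "spark.driver.memory": driver_memory,
        "spark.executor.memory": executor_memory,
    }


def build_session(conf, app_name="oom-example"):
    """Create a SparkSession with the given settings applied."""
    # Imported here so memory_conf() can be used without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Note that `spark.driver.memory` only takes effect if it is set before the driver JVM starts, so in a notebook this needs to run in a fresh kernel before any other session is created. How much memory is actually needed depends on the data and the machine.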
Answer 1 (score: 0)
This problem is caused by the Java version. I had Spark 2.3.3 with Java 11. I removed Java 11 and installed Java 8.
Problem solved.
*********@*********-VirtualBox:/opt/java$ sudo update-alternatives --install "/usr/bin/java" "java" "/opt/java/jdk1.8.0_202/"
update-alternatives: --install needs <link> <name> <path> <priority>
Use 'update-alternatives --help' for program usage information.
*********@*********-VirtualBox:/opt/java$ sudo update-alternatives --install "/usr/bin/java" "java" "/opt/java/jdk1.8.0_202/" 1
update-alternatives: using /opt/java/jdk1.8.0_202/ to provide /usr/bin/java (java) in auto mode
*********@*********-VirtualBox:/opt/java$ sudo update-alternatives --install "/usr/bin/java" "java" "/opt/java/jdk1.8.0_202/bin/java" 1
*********@*********-VirtualBox:/opt/java$ sudo update-alternatives --install "/usr/bin/javac" "javac" "/opt/java/jdk1.8.0_202/bin/javac" 1
update-alternatives: using /opt/java/jdk1.8.0_202/bin/javac to provide /usr/bin/javac (javac) in auto mode
*********@*********-VirtualBox:/opt/java$ java -version
Command 'java' not found, but can be installed with:
sudo apt install default-jre
sudo apt install openjdk-11-jre-headless
sudo apt install openjdk-8-jre-headless
*********@*********-VirtualBox:/opt/java$ sudo update-alternatives --install "/usr/bin/javaws" "javaws" "/opt/java/jdk1.8.0_202/bin/javaws" 1
update-alternatives: using /opt/java/jdk1.8.0_202/bin/javaws to provide /usr/bin/javaws (javaws) in auto mode
*********@*********-VirtualBox:/opt/java$ sudo update-alternatives --install "/usr/bin/jar" "jar" "/opt/java/jdk1.8.0_202/bin/jar" 1
update-alternatives: using /opt/java/jdk1.8.0_202/bin/jar to provide /usr/bin/jar (jar) in auto mode
*********@*********-VirtualBox:/opt/java$ sudo update-alternatives --set "java" "/opt/java/jdk1.8.0_202/bin/java"
update-alternatives: using /opt/java/jdk1.8.0_202/bin/java to provide /usr/bin/java (java) in manual mode
*********@*********-VirtualBox:/opt/java$ sudo update-alternatives --set "javac" "/opt/java/jdk1.8.0_202/bin/javac"
*********@*********-VirtualBox:/opt/java$ sudo update-alternatives --set "javaws" "/opt/java/jdk1.8.0_202/bin/javaws"
*********@*********-VirtualBox:/opt/java$ sudo update-alternatives --set "jar" "/opt/java/jdk1.8.0_202/bin/jar"
*********@*********-VirtualBox:/opt/java$ cd
*********@*********-VirtualBox:~$ java -version
java version "1.8.0_202"
Java(TM) SE Runtime Environment (build 1.8.0_202-b08)
Java HotSpot(TM) 64-Bit Server VM (build 25.202-b08, mixed mode)
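The final `java -version` check above can also be done from Python before starting Spark. This is a minimal sketch, assuming `java` is on the PATH; the helper names `java_major_version` and `installed_java_major` are hypothetical.

```python
import re
import subprocess


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no Java version string found")
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x, e.g. "1.8.0_202".
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major():
    """Run `java -version` and return the major version of the default JVM."""
    # `java -version` writes its report to stderr, not stdout.
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return java_major_version(result.stderr)
```

With the output shown above, `java_major_version` would return 8, which matches the Java version the Spark 2.3.x line expects.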