Spark memory error: Java Runtime Environment

Date: 2017-05-29 10:20:51

Tags: apache-spark pyspark pyspark-sql

I created a Spark job in Python that retrieves data from Redshift and then applies a lot of transformations: join, filter, withColumn, agg ... There are about 30K records in the DataFrame. All the transformations run, but when I try to write the AVRO file the Spark job fails.
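Roughly, the job looks like this (a simplified sketch; the table name, column names, connection URL, and output path are placeholders, not the real ones):

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("iface_extractions").getOrCreate()

# Read the source table from Redshift over JDBC (driver supplied via --jars);
# url, dbtable, and credentials here are placeholders.
events = (spark.read.format("jdbc")
          .option("url", "jdbc:redshift://host:5439/db?user=USER&password=PASS")
          .option("driver", "com.amazon.redshift.jdbc42.Driver")
          .option("dbtable", "schema.events")
          .load())

# A transformation chain of the kind described: filter, withColumn, agg.
daily = (events.filter(F.col("event_date") == "2016-10-01")
         .withColumn("day", F.to_date(F.col("event_date")))
         .groupBy("day")
         .agg(F.count("*").alias("records")))

# Write Avro via the com.databricks:spark-avro package from --packages;
# this is the step where the job fails.
daily.write.format("com.databricks.spark.avro").mode("overwrite").save("/tmp/out")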

My spark-submit:


. /usr/bin/spark-submit --packages="com.databricks:spark-avro_2.11:3.2.0" --jars RedshiftJDBC42-1.2.1.1001.jar --deploy-mode client --master yarn --num-executors 10 --executor-cores 3 --executor-memory 10G --driver-memory 14g --conf spark.sql.broadcastTimeout=3600 --conf spark.network.timeout=10000000 --py-files dependencies.zip iface_extractions.py 2016-10-01 > output.log

I am using --executor-memory 10G --driver-memory 14g on 6 machines on Amazon, each with 8 cores and 15G of RAM. Why am I getting an out-of-memory error?
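As a sanity check on those numbers (assuming Spark 2.x's default YARN executor memory overhead, spark.yarn.executor.memoryOverhead = max(384 MB, 10% of executor memory); everything else is taken from the flags above):

# Back-of-the-envelope resource math for the spark-submit flags above.
executor_heap_gb = 10
overhead_gb = max(0.384, 0.10 * executor_heap_gb)  # ~1 GB by default
container_gb = executor_heap_gb + overhead_gb      # ~11 GB per executor
node_ram_gb = 15

# Each 15 GB node fits only one ~11 GB executor container, and a 14 GB
# driver leaves almost nothing for the OS on its node, so native (mmap)
# allocations outside the JVM heap can fail even when the heap itself fits.
print(container_gb, node_ram_gb // container_gb)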

Here is the end of the Spark log:

#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 196608 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/hadoop/hs_err_pid13688.log

0 Answers