I submitted a PySpark job, but after running for some time the job fails with the following error:
20/10/08 06:49:30 ERROR Client: Application diagnostics message: Application application_1602138886042_0001 failed 2 times due to AM Container for appattempt_1602138886042_0001_000002 exited with exitCode: -104
Failing this attempt.Diagnostics: Container [pid=16756,containerID=container_1602138886042_0001_02_000001] is running beyond physical memory limits. Current usage: 1.6 GB of 1.5 GB physical memory used; 4.4 GB of 7.5 GB virtual memory used. Killing container.
Dump of the process-tree for container_1602138886042_0001_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 16756 16754 16756 16756 (bash) 0 0 115871744 704 /bin/bash -c LD_LIBRARY_PATH="/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:::/usr/lib/hadoop-lzo/lib/native:/usr/lib/hadoop/lib/native::/usr/lib/hadoop-lzo/lib/native:/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/usr/lib/hadoop/lib/native" /usr/lib/jvm/java-openjdk/bin/java -server -Xmx1024m -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_1602138886042_0001/container_1602138886042_0001_02_000001/tmp '-XX:+UseConcMarkSweepGC' '-XX:CMSInitiatingOccupancyFraction=70' '-XX:MaxHeapFreeRatio=70' '-XX:+CMSClassUnloadingEnabled' '-XX:OnOutOfMemoryError=kill -9 %p' -
To work around this memory issue, I tried changing the driver and executor memory settings, but the job still fails. Below is the spark-submit command:
Args': ['spark-submit',
'--deploy-mode', 'cluster',
'--master', 'yarn',
'--executor-memory',
conf['emr_step_executor_memory'],
'--executor-cores',
conf['emr_step_executor_cores'],
'--conf',
'spark.yarn.submit.waitAppCompletion=true',
'--conf',
'spark.rpc.message.maxSize=1024',
'--conf',
'spark.driver.memoryOverhead=512',
'--conf',
'spark.executor.memoryOverhead=512',
'--conf',
'spark.driver.memory =2g',
'--conf',
'spark.driver.cores=2']
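For reference, here is a rough sketch of how the failing AM/driver container size appears to be derived (this is my reading of the log, assuming cluster deploy mode, where the YARN AM container is sized as roughly spark.driver.memory plus spark.driver.memoryOverhead):

    # Sketch only: numbers taken from the container dump and the submit args above.
    driver_memory_mb = 1024          # -Xmx1024m seen in the process-tree dump
    driver_memory_overhead_mb = 512  # spark.driver.memoryOverhead from the submit args

    am_container_mb = driver_memory_mb + driver_memory_overhead_mb
    print(am_container_mb)           # 1536 MB, i.e. the "1.5 GB physical memory" limit in the error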
Master machine on AWS: c4.2xlarge. Core machines on AWS: c4.4xlarge.
One important note: the data is not large at all, less than 50 MB.