I am running GraphX on Spark on AWS EMR, with an input file of roughly 100 GB. My cluster is configured as follows:
Nodes - 10
Memory - 122 GB each
Disk - 320 GB each
No matter what I do, I keep hitting a memory error when I launch the Spark job with spark-submit:
spark-submit --deploy-mode cluster \
    --class com.news.ncg.report.graph.NcgGraphx \
    ncgaka-graphx-assembly-1.0.jar true s3://<bkt>/<folder>/run=2016-08-19-02-06-20/part* output
Error:
AM Container for appattempt_1474446853388_0001_000001 exited with exitCode: -104
For more detailed output, check application tracking page:http://ip-172-27-111-41.ap-southeast-2.compute.internal:8088/cluster/app/application_1474446853388_0001Then, click on links to logs of each attempt.
Diagnostics: Container [pid=7902,containerID=container_1474446853388_0001_01_000001] is running beyond physical memory limits. Current usage: 1.4 GB of 1.4 GB physical memory used; 3.4 GB of 6.9 GB virtual memory used. Killing container.
Dump of the process-tree for container_1474446853388_0001_01_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 7907 7902 7902 7902 (java) 36828 2081 3522265088 359788 /usr/lib/jvm/java-openjdk/bin/java -server -Xmx1024m -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_1474446853388_0001/container_1474446853388_0001_01_000001/tmp -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError=kill -9 %p -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_1474446853388_0001/container_1474446853388_0001_01_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class com.news.ncg.report.graph.NcgGraphx --jar s3://discover-pixeltoucher/jar/ncgaka-graphx-assembly-1.0.jar --arg true --arg s3://discover-pixeltoucher/ncgus/run=2016-08-19-02-06-20/part* --arg s3://discover-pixeltoucher/output/20160819/ --properties-file /mnt/yarn/usercache/hadoop/appcache/application_1474446853388_0001/container_1474446853388_0001_01_000001/__spark_conf__/__spark_conf__.properties
|- 7902 7900 7902 7902 (bash) 0 0 115810304 687 /bin/bash -c LD_LIBRARY_PATH=/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:::/usr/lib/hadoop-lzo/lib/native:/usr/lib/hadoop/lib/native::/usr/lib/hadoop-lzo/lib/native:/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/usr/lib/hadoop/lib/native /usr/lib/jvm/java-openjdk/bin/java -server -Xmx1024m -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_1474446853388_0001/container_1474446853388_0001_01_000001/tmp '-XX:+UseConcMarkSweepGC' '-XX:CMSInitiatingOccupancyFraction=70' '-XX:MaxHeapFreeRatio=70' '-XX:+CMSClassUnloadingEnabled' '-XX:OnOutOfMemoryError=kill -9 %p' -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_1474446853388_0001/container_1474446853388_0001_01_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class 'com.news.ncg.report.graph.NcgGraphx' --jar s3://discover-pixeltoucher/jar/ncgaka-graphx-assembly-1.0.jar --arg 'true' --arg 's3://discover-pixeltoucher/ncgus/run=2016-08-19-02-06-20/part*' --arg 's3://discover-pixeltoucher/output/20160819/' --properties-file /mnt/yarn/usercache/hadoop/appcache/application_1474446853388_0001/container_1474446853388_0001_01_000001/__spark_conf__/__spark_conf__.properties 1> /var/log/hadoop-yarn/containers/application_1474446853388_0001/container_1474446853388_0001_01_000001/stdout 2> /var/log/hadoop-yarn/containers/application_1474446853388_0001/container_1474446853388_0001_01_000001/stderr
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt
Any idea how I can stop getting this error?
I create the SparkSession as follows:
val spark = SparkSession
  .builder()
  .master(mode)
  .config("spark.hadoop.validateOutputSpecs", "false")
  .config("spark.driver.cores", "1")
  .config("spark.driver.memory", "30g")
  .config("spark.executor.memory", "19g")
  .config("spark.executor.cores", "5")
  .config("spark.yarn.executor.memoryOverhead", "2g")
  .config("spark.yarn.driver.memoryOverhead", "1g")
  .config("spark.shuffle.compress", "true")
  .config("spark.shuffle.service.enabled", "true")
  .config("spark.scheduler.mode", "FAIR")
  .config("spark.speculation", "true")
  .appName("NcgGraphX")
  .getOrCreate()
Answer (score: 1)
It looks like you want to deploy your Spark application on YARN. If that is the case, you should not hard-code the application properties in your code; pass them to spark-submit instead:
$ ./bin/spark-submit --class com.news.ncg.report.graph.NcgGraphx \
--master yarn \
--deploy-mode cluster \
--driver-memory 30g \
--executor-memory 19g \
--executor-cores 5 \
<other options>
ncgaka-graphx-assembly-1.0.jar true s3://<bkt>/<folder>/run=2016-08-19-02-06-20/part* output
In client mode the JVM has already been set up by the time your application code runs, so I would personally use the CLI to pass these options. The same problem bites you in cluster mode: the driver runs inside the ApplicationMaster container, whose heap is fixed when YARN launches it, which is why the AM is killed at 1.4 GB (default 1g driver memory plus overhead) even though your code asks for spark.driver.memory of 30g.
Once you pass the memory options via spark-submit, change your code to pick those settings up dynamically: SparkSession.builder().getOrCreate()
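A minimal sketch of what that could look like (an assumption on my part: the job-level settings stay in code, while all memory/core/overhead sizing comes from the spark-submit command line; .master(mode) can go too, since --master yarn is supplied there):

import org.apache.spark.sql.SparkSession

// Resource sizing (driver/executor memory, cores, overhead) is supplied
// by spark-submit, so the builder only carries job-level configuration.
val spark = SparkSession
  .builder()
  .config("spark.hadoop.validateOutputSpecs", "false")
  .config("spark.shuffle.compress", "true")
  .config("spark.shuffle.service.enabled", "true")
  .config("spark.scheduler.mode", "FAIR")
  .config("spark.speculation", "true")
  .appName("NcgGraphX")
  .getOrCreate()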
PS. You may also want to increase the AM's memory via the spark.yarn.am.memory property.
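For instance, a sketch of the full command with the YARN overhead properties passed explicitly (the overhead values simply mirror the ones from your builder; they are illustrative starting points, not tuned numbers):

$ ./bin/spark-submit --class com.news.ncg.report.graph.NcgGraphx \
    --master yarn \
    --deploy-mode cluster \
    --driver-memory 30g \
    --executor-memory 19g \
    --executor-cores 5 \
    --conf spark.yarn.executor.memoryOverhead=2g \
    --conf spark.yarn.driver.memoryOverhead=1g \
    ncgaka-graphx-assembly-1.0.jar true s3://<bkt>/<folder>/run=2016-08-19-02-06-20/part* output

Note that spark.yarn.am.memory (e.g. --conf spark.yarn.am.memory=2g) takes effect in client mode; in cluster mode the AM hosts the driver, so --driver-memory is what sizes that container.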