How do I tune the Spark executor PermGen parameter with Cloudera Manager?

Time: 2016-05-30 03:49:20

Tags: apache-spark yarn cloudera-manager

When I try to run a random forest on a Spark on YARN cluster (3 data nodes), I hit an OutOfMemoryError.

Below is the error stack trace from the container log on the NodeManager.

OutOfMemoryError exception log:

16/05/30 13:41:17 WARN yarn.YarnAllocator: Expected to find pending requests, but found none.
Exception in thread "dispatcher-event-loop-4" java.lang.OutOfMemoryError: PermGen space
    at sun.misc.Unsafe.defineClass(Native Method)
    at sun.reflect.ClassDefiner.defineClass(ClassDefiner.java:63)
    at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:399)
    at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:396)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.reflect.MethodAccessorGenerator.generate(MethodAccessorGenerator.java:395)
    at sun.reflect.MethodAccessorGenerator.generateSerializationConstructor(MethodAccessorGenerator.java:113)
    at sun.reflect.ReflectionFactory.newConstructorForSerialization(ReflectionFactory.java:331)
    at java.io.ObjectStreamClass.getSerializableConstructor(ObjectStreamClass.java:1376)
    at java.io.ObjectStreamClass.access$1500(ObjectStreamClass.java:72)
    at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:493)
    at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
    at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
    at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:464)
    at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1133)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
    at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:44)
    at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:101)
    at org.apache.spark.rpc.netty.NettyRpcEnv.serialize(NettyRpcEnv.scala:251)
    at org.apache.spark.rpc.netty.NettyRpcEnv.ask(NettyRpcEnv.scala:228)
    at org.apache.spark.rpc.netty.NettyRpcEndpointRef.ask(NettyRpcEnv.scala:509)
    at org.apache.spark.rpc.RpcEndpointRef.ask(RpcEndpointRef.scala:62)
    at org.apache.spark.storage.BlockManagerMasterEndpoint$$anonfun$org$apache$spark$storage$BlockManagerMasterEndpoint$$removeRdd$2.apply(BlockManagerMasterEndpoint.scala:147)
    at org.apache.spark.storage.BlockManagerMasterEndpoint$$anonfun$org$apache$spark$storage$BlockManagerMasterEndpoint$$removeRdd$2.apply(BlockManagerMasterEndpoint.scala:146)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
16/05/30 13:41:21 INFO yarn.YarnAllocator: Canceling requests for 0 executor containers

Configuration (executor launch command) from the log:

LD_LIBRARY_PATH="/usr/lib/hadoop/lib/native:$LD_LIBRARY_PATH" {{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms1024m -Xmx1024m -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.authenticate=false' '-Dspark.shuffle.service.port=7337' '-Dspark.driver.port=34896' '-Dspark.ui.port=0' -Dspark.yarn.app.container.log.dir=<LOG_DIR> -XX:MaxPermSize=256m org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@172.26.34.93:34896 --executor-id 2 --hostname datalake01 --cores 1 --app-id application_1464237978069_0248 --user-class-path file:$PWD/__app__.jar --user-class-path file:$PWD/com.databricks_spark-csv_2.11-1.4.0.jar --user-class-path file:$PWD/org.apache.commons_commons-csv-1.1.jar --user-class-path file:$PWD/com.univocity_univocity-parsers-1.5.1.jar 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr

How do I increase the PermGen size?

This -XX:MaxPermSize=256m is the parameter I want to tune, but how do I change it through Cloudera Manager?

1 Answer:

Answer 0 (score: 3)

You can add JVM parameters to the spark-submit command. It is important to note that the driver and the executors have separate configurations:

spark/bin/spark-submit ... --conf spark.driver.extraJavaOptions="-XX:MaxPermSize=256M" --conf spark.executor.extraJavaOptions="-XX:MaxPermSize=256M" ...
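Since the launch command in the question shows the executor already running with -XX:MaxPermSize=256m, you will likely need a larger value than 256M for the error to go away. If you would rather apply the setting cluster-wide than pass it on every spark-submit, a sketch of the equivalent spark-defaults.conf entries follows, assuming you add them through Cloudera Manager's advanced configuration snippet (safety valve) for spark-conf/spark-defaults.conf; the exact field name varies by CM version, and 512m is an illustrative value, not a tested recommendation:

# Illustrative spark-defaults.conf entries; 512m is an example value
spark.driver.extraJavaOptions -XX:MaxPermSize=512m
spark.executor.extraJavaOptions -XX:MaxPermSize=512m

Note that -XX:MaxPermSize only has an effect on Java 7 and earlier. On Java 8 the permanent generation was replaced by Metaspace, the JVM ignores this flag with a warning, and the corresponding limit is set with -XX:MaxMetaspaceSize instead.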