Spark executor spends most of its time in Java GC

Date: 2018-01-31 11:40:01

Tags: apache-spark garbage-collection

The Spark executor is started with the following options:

/root/spark/jdk1.8.0_151/bin/java -cp /root/spark/spark-2.2.0-bin-hadoop2.7/conf/:/root/spark/spark-2.2.0-bin-hadoop2.7/jars/* -Xmx6144M -Dspark.driver.port=20637 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@172.16.50.102:20637 --executor-id 29 --hostname 172.16.50.103 --cores 2 --app-id app-20180131184049-0002 --worker-url spark://Worker@172.16.50.103:39368
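
For reference, these flags normally come from Spark configuration rather than a hand-written java command line; a minimal sketch of the usual settings (the app name is hypothetical, and this is not the author's actual launch code):

import org.apache.spark.sql.SparkSession;

public class ExecutorGcLogging {
    public static void main(String[] args) {
        // spark.executor.memory becomes the executor's -Xmx (here -Xmx6144M);
        // spark.executor.extraJavaOptions injects the GC-logging flags.
        SparkSession ss = SparkSession.builder()
                .appName("jdbc-etl") // hypothetical application name
                .config("spark.executor.memory", "6g")
                .config("spark.executor.extraJavaOptions",
                        "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
                .getOrCreate();
        ss.stop();
    }
}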

Java GC runs very frequently (from the executor's log):

2.431: [GC (Metadata GC Threshold) [PSYoungGen: 362763K->34308K(611840K)] 362763K->34396K(2010112K), 0.0780262 secs] [Times: user=1.09 sys=0.18, real=0.08 secs]
2.509: [Full GC (Metadata GC Threshold) [PSYoungGen: 34308K->0K(611840K)] [ParOldGen: 88K->32991K(772096K)] 34396K->32991K(1383936K), [Metaspace: 20866K->20866K(1067008K)], 0.0541261 secs] [Times: user=0.70 sys=0.08, real=0.05 secs]
303.670: [GC (Allocation Failure) [PSYoungGen: 524800K->87035K(834560K)] 557791K->266418K(1606656K), 0.1241616 secs] [Times: user=2.92 sys=0.51, real=0.12 secs]
315.196: [GC (Allocation Failure) [PSYoungGen: 834555K->87032K(1136640K)] 1013938K->981300K(2037248K), 0.4551608 secs] [Times: user=12.47 sys=5.12, real=0.46 secs]
315.651: [Full GC (Ergonomics) [PSYoungGen: 87032K->69466K(1136640K)] [ParOldGen: 894267K->887330K(2752000K)] 981300K->956797K(3888640K), [Metaspace: 34446K->34446K(1079296K)], 5.9107553 secs] [Times: user=227.48 sys=4.56, real=5.91 secs]
336.571: [GC (Allocation Failure) [PSYoungGen: 1119066K->87030K(1225728K)] 2006397K->1979465K(3977728K), 0.7949645 secs] [Times: user=22.85 sys=10.80, real=0.79 secs]
337.366: [Full GC (Ergonomics) [PSYoungGen: 87030K->0K(1225728K)] [ParOldGen: 1892434K->1975360K(4194304K)] 1979465K->1975360K(5420032K), [Metaspace: 34446K->34446K(1079296K)], 12.1924380 secs] [Times: user=488.02 sys=4.94, real=12.20 secs]
366.596: [GC (Allocation Failure) [PSYoungGen: 1138688K->87012K(1225728K)] 3114048K->3116557K(5420032K), 0.9059287 secs] [Times: user=31.37 sys=5.71, real=0.91 secs]
367.502: [Full GC (Ergonomics) [PSYoungGen: 87012K->0K(1225728K)] [ParOldGen: 3029544K->3096222K(4194304K)] 3116557K->3096222K(5420032K), [Metaspace: 34449K->34449K(1079296K)], 13.1129752 secs] [Times: user=518.70 sys=11.04, real=13.11 secs]
396.419: [Full GC (Ergonomics) [PSYoungGen: 1138688K->1023K(1225728K)] [ParOldGen: 3096222K->4193874K(4194304K)] 4234910K->4194898K(5420032K), [Metaspace: 34456K->34456K(1079296K)], 17.7615804 secs] [Times: user=714.95 sys=22.54, real=17.76 secs]
430.400: [Full GC (Ergonomics) [PSYoungGen: 1138688K->1101822K(1225728K)] [ParOldGen: 4193874K->4193922K(4194304K)] 5332562K->5295744K(5420032K), [Metaspace: 34462K->34462K(1079296K)], 24.3810387 secs] [Times: user=997.79 sys=15.83, real=24.38 secs]
454.851: [Full GC (Ergonomics) [PSYoungGen: 1138688K->1130794K(1225728K)] [ParOldGen: 4193922K->4193922K(4194304K)] 5332610K->5324716K(5420032K), [Metaspace: 34477K->34477K(1079296K)], 26.3723404 secs] [Times: user=1086.31 sys=11.56, real=26.37 secs]
481.226: [Full GC (Ergonomics) [PSYoungGen: 1138688K->1130798K(1225728K)] [ParOldGen: 4193922K->4193922K(4194304K)] 5332610K->5324720K(5420032K), [Metaspace: 34477K->34477K(1079296K)], 19.2936132 secs] [Times: user=779.84 sys=22.07, real=19.30 secs]
500.521: [Full GC (Ergonomics) [PSYoungGen: 1138688K->1130862K(1225728K)] [ParOldGen: 4193922K->4193922K(4194304K)] 5332610K->5324784K(5420032K), [Metaspace: 34477K->34477K(1079296K)], 22.6870152 secs] [Times: user=926.71 sys=18.37, real=22.69 secs]

The frequent GC stalls the executor for long stretches. Note the final Full GCs: ParOldGen stays at 4193922K->4193922K, so the old generation is completely full and each 20+ second collection reclaims almost nothing. During those pauses the executor cannot report its status to the driver, so the driver kills it.

The Spark program:

// Register one temp view per source table; dbtable maps view name -> subquery.
for (Map.Entry<String, String> me : this.dbtable.entrySet()) {
    // Build "(subquery) alias" for the JDBC dbtable option.
    String dt = "(" + me.getValue() + ")" + me.getKey();
    logger.info("[\033[32m" + dt + "\033[0m]");
    Dataset<Row> jdbcDF = ss.read().format("jdbc")
            .option("driver", "com.mysql.jdbc.Driver")
            .option("url", this.url)
            .option("dbtable", dt)
            .option("user", this.user)
            .option("password", this.password)
            .option("useSSL", false)
            .load();
    jdbcDF.createOrReplaceTempView(me.getKey());
}
// Run the user-supplied SQL over the registered views and write the result back via JDBC.
Dataset<Row> result = ss.sql(this.sql);
result.write().format("jdbc")
        .option("driver", "com.mysql.jdbc.Driver")
        .option("url", this.dst_url)
        .option("dbtable", this.dst_table)
        .option("user", this.user)
        .option("password", this.password)
        .option("useSSL", false)
        .option("rewriteBatchedStatements", true)       // batch inserts on the MySQL side
        .option("sessionVariables", "sql_log_bin=off")  // skip binlogging for this session
        .save();
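
Note that the read above sets no partitioning options, so each table comes back as a single JDBC partition: one task pulls the entire table through one connection, which matches the heap growth in the GC log. A minimal sketch of a partitioned read (the column name "id" and its bounds are hypothetical; all four options must be given together):

Dataset<Row> jdbcDF = ss.read().format("jdbc")
        .option("driver", "com.mysql.jdbc.Driver")
        .option("url", this.url)
        .option("dbtable", dt)
        .option("user", this.user)
        .option("password", this.password)
        .option("useSSL", false)
        // Split the read into 8 range-partitioned queries on a numeric column
        // so no single task has to materialize the whole table.
        .option("partitionColumn", "id")   // hypothetical numeric column
        .option("lowerBound", "1")         // hypothetical min(id)
        .option("upperBound", "10000000")  // hypothetical max(id)
        .option("numPartitions", "8")
        .load();

Each partition then issues its own WHERE id >= ... AND id < ... range query instead of one full-table scan.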

The stack trace:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3181)
    at java.util.ArrayList.grow(ArrayList.java:265)
    at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:239)
    at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:231)
    at java.util.ArrayList.add(ArrayList.java:462)
    at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:3414)
    at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:470)
    at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:3112)
    at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:2341)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2736)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2484)
    at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1858)
    at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1966)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD.compute(JDBCRDD.scala:301)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
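
The trace shows the OOM inside Connector/J's MysqlIO.readSingleRowSet, i.e. the driver buffering the entire ResultSet into an ArrayList on the executor. That is Connector/J's documented default: it only streams results when given a forward-only, read-only statement with fetch size Integer.MIN_VALUE. A minimal plain-JDBC sketch of that streaming mode (a standalone illustration with hypothetical connection details, not the Spark code path):

import java.sql.*;

public class StreamingReadSketch {
    public static void main(String[] args) throws SQLException {
        String url = "jdbc:mysql://host:3306/db"; // hypothetical URL
        try (Connection conn = DriverManager.getConnection(url, "user", "pass");
             Statement st = conn.createStatement(
                     ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            st.setFetchSize(Integer.MIN_VALUE); // enable row-by-row streaming
            try (ResultSet rs = st.executeQuery("SELECT * FROM big_table")) {
                while (rs.next()) {
                    // process one row at a time without buffering the full set
                }
            }
        }
    }
}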

0 answers