java.lang.OutOfMemoryError: Java heap space when using Docker

Posted: 2018-06-13 17:16:47

Tags: apache-spark docker-compose spark-submit

So I am running the following locally (standalone):

~/spark-2.1.0-bin-hadoop2.7/bin/spark-submit --py-files afile.py run_script.py 

and I get the following error:

java.lang.OutOfMemoryError: Java heap space

To get past this, I run the following instead:

~/spark-2.1.0-bin-hadoop2.7/bin/spark-submit --driver-memory 6G --executor-memory 1G --py-files afile.py run_script.py

and the script runs fine.
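
Equivalently, the same settings can live in Spark's conf/spark-defaults.conf instead of being repeated on every invocation; a minimal sketch, assuming the stock spark-2.1.0-bin-hadoop2.7 layout (the values simply mirror the flags above):

# conf/spark-defaults.conf -- illustrative values matching the command above
spark.driver.memory    6g
spark.executor.memory  1g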

Now, I am using the following Docker build for Spark and executing:

docker-compose up
docker exec app_master_1 bin/spark-submit --driver-memory 6G --executor-memory 1G --py-files afile.py run_script.py

In this case I still get the following error:

2018-06-13 21:43:16 WARN  TaskSetManager:66 - Lost task 0.0 in stage 3.0 (TID 9, 172.17.0.3, executor 0): java.lang.OutOfMemoryError: Java heap space
at org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder.grow(BufferHolder.java:77)
at org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter.write(UnsafeRowWriter.java:219)
at org.apache.spark.sql.execution.datasources.text.TextFileFormat$$anonfun$readToUnsafeMem$1$$anonfun$apply$4.apply(TextFileFormat.scala:143)
at org.apache.spark.sql.execution.datasources.text.TextFileFormat$$anonfun$readToUnsafeMem$1$$anonfun$apply$4.apply(TextFileFormat.scala:140)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.next(FileScanRDD.scala:109)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1336)
at scala.collection.TraversableOnce$class.fold(TraversableOnce.scala:212)
at scala.collection.AbstractIterator.fold(Iterator.scala:1336)
at org.apache.spark.rdd.RDD$$anonfun$fold$1$$anonfun$19.apply(RDD.scala:1090)
at org.apache.spark.rdd.RDD$$anonfun$fold$1$$anonfun$19.apply(RDD.scala:1090)
at org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:2123)
at org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:2123)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)

and somewhere later:

2018-06-13 21:43:17 ERROR TaskSchedulerImpl:70 - Lost executor 0 on 172.17.0.3: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages

As far as I understand, even though the log says it ran out of memory in executor 0, I have to increase the driver memory since it is running standalone, right?

Any idea why this is happening and how to get around it?

EDIT

The error occurs when I try to use sqlCont.read.json(json_path), and the file is not even that big.
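
For context, a minimal sketch of the kind of read that triggers it, assuming run_script.py builds a session along these lines (sqlCont and json_path are the names used above; the actual path is a placeholder):

from pyspark.sql import SparkSession

# hypothetical reconstruction of the relevant part of run_script.py
spark = SparkSession.builder.appName("run_script").getOrCreate()
sqlCont = spark            # stand-in for the question's sqlCont; .read.json works the same on a SQLContext
json_path = "data.json"    # placeholder; the real path is not given in the question
df = sqlCont.read.json(json_path)
df.show()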

1 Answer:

Answer 0 (score: 0):

As you can see here, in the docker scripts the worker node is initialized with 1 GB of memory, and you run the spark-submit command asking for 1 GB per executor, so either reduce the executor memory or increase the worker memory when creating the docker containers.
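
For illustration only (the compose file itself is not shown in the question), the first option amounts to asking for executors smaller than the worker's 1 GB, e.g.:

docker exec app_master_1 bin/spark-submit --driver-memory 6G --executor-memory 512m --py-files afile.py run_script.py

For the second option, Spark standalone workers read the standard SPARK_WORKER_MEMORY environment variable (e.g. SPARK_WORKER_MEMORY=2g), so raising it in the worker service's environment in docker-compose.yml should give the executors more room, assuming the image passes that variable through to the worker process and the container itself is allowed that much memory.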