Understanding spark.yarn.executor.memoryOverhead

Date: 2018-07-09 00:58:07

Tags: apache-spark

I am running a Spark application on YARN with the driver and executor memory set as --driver-memory 4G --executor-memory 2G.

When I run the application, an exception is thrown complaining: Container killed by YARN for exceeding memory limits. 2.5 GB of 2.5 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.

What does the 2.5 GB mean here (overhead memory, executor memory, or overhead + executor memory)? I ask because when I change the memory settings to:

--driver-memory 4G --executor-memory 4G --conf spark.yarn.executor.memoryOverhead=2048, then the exception disappears.

My question is: although I have boosted the overhead memory only to 2 GB, which is still under 2.5 GB, why does it work now?
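For reference, a minimal Scala sketch of applying the same executor settings through SparkConf (the app name is hypothetical). Note that spark.driver.memory must be given at submit time via spark-submit --driver-memory, because the driver JVM is already running by the time this code executes; the executor settings below do take effect:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("memory-overhead-demo")                       // hypothetical app name
  .config("spark.executor.memory", "4g")
  .config("spark.yarn.executor.memoryOverhead", "2048")  // interpreted as MB
  .getOrCreate()
```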

1 Answer:

Answer 0 (score: 2):

Let us understand how memory is divided among the different regions in Spark.

  1. Executor memoryOverhead:

spark.yarn.executor.memoryOverhead = max(384 MB, 0.07 * spark.executor.memory). In your first case, memoryOverhead = max(384 MB, 0.07 * 2 GB) = max(384 MB, 143.36 MB) = 384 MB. So, assuming you assigned a single core per executor, 384 MB of overhead is reserved in each executor. The container YARN allocates is the executor memory plus this overhead, 2 GB + 384 MB ≈ 2.4 GB, which YARN rounds up to its minimum allocation increment; that is the 2.5 GB limit quoted in the error message.
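To make the arithmetic concrete, here is a small Scala sketch of the default-overhead rule as quoted above (the 0.07 factor is the one this answer cites; other Spark releases may use a different factor):

```scala
// The default overhead rule quoted above, with sizes in MB.
def defaultOverheadMb(executorMemoryMb: Long): Long =
  math.max(384L, (0.07 * executorMemoryMb).toLong)

val overheadMb  = defaultOverheadMb(2048)  // max(384, 143) = 384 MB
val containerMb = 2048 + overheadMb        // 2432 MB, rounded up by YARN to 2.5 GB
```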

  2. Execution and storage memory:

By default, spark.memory.fraction = 0.6, which means that execution and storage, as a unified region, occupy 60% of the remaining memory, i.e. 998 MB. Unless you enable spark.memory.useLegacyMode, there is no strict boundary assigned to either sub-region; they share a moving boundary.
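A sketch of this answer's arithmetic for the unified region (it treats the "remaining memory" as executor memory minus the overhead computed above):

```scala
// Unified execution + storage region, following the answer's numbers (MB).
val executorMemoryMb = 2048.0
val overheadMb       = 384.0
val memoryFraction   = 0.6          // spark.memory.fraction default

val unifiedRegionMb  = (executorMemoryMb - overheadMb) * memoryFraction
// (2048 - 384) * 0.6 ≈ 998 MB, shared across a moving boundary
```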

  3. User memory:

This is the memory pool that remains after execution and storage memory have been allocated, and it is completely up to you how to use it. You can store your own data structures there to be used inside RDD transformations. For example, you can rewrite a Spark aggregation by using a mapPartitions transformation that maintains a hash table for the aggregation to run, as sketched below. This comprises the remaining 40% of memory left after memoryOverhead; in your case it is ~660 MB ((2048 − 384) MB × 0.4).
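A minimal, runnable sketch of that rewrite (names and input are illustrative): a word count implemented with mapPartitions, where the per-partition hash table lives in user memory rather than in Spark's managed execution/storage regions:

```scala
import org.apache.spark.sql.SparkSession
import scala.collection.mutable

val spark = SparkSession.builder().appName("user-memory-demo").master("local[*]").getOrCreate()
val words = spark.sparkContext.parallelize(Seq("a", "b", "a", "c", "b", "a"))

// Aggregate each partition into a hash table held in user memory.
val partialCounts = words.mapPartitions { iter =>
  val counts = mutable.HashMap.empty[String, Long]
  iter.foreach { w => counts(w) = counts.getOrElse(w, 0L) + 1L }
  counts.iterator
}

val totals = partialCounts.reduceByKey(_ + _).collect()  // Array((a,3), (b,2), (c,1))
```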

If your job does not respect any of the above allocations, it is highly likely to end up with OOM problems.
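Putting the pieces together for your two runs (a sketch; the exact rounding depends on yarn.scheduler.minimum-allocation-mb):

```scala
// Container budget = executor memory + overhead (explicit or default), in MB.
def containerMb(executorMb: Long, explicitOverheadMb: Option[Long]): Long =
  executorMb + explicitOverheadMb.getOrElse(math.max(384L, (0.07 * executorMb).toLong))

containerMb(2048, None)        // 2432 MB -> the 2.5 GB container from the error
containerMb(4096, Some(2048))  // 6144 MB -> a 6 GB container, ample headroom
```

So the exception disappears in your second run not because 2 GB exceeds 2.5 GB, but because the whole container grows to roughly 6 GB, giving the off-heap usage that previously breached the 2.5 GB cap plenty of room.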