Question

I am trying to read a large HBase table (about 100 GB) in Spark.

Spark version: 1.6

spark-submit parameters:

spark-submit --master yarn-client --num-executors 10  --executor-memory 4G 
             --executor-cores 4 
             --conf spark.yarn.executor.memoryOverhead=2048

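Error: ExecutorLostFailure Reason: Container killed by YARN for exceeding memory limits. 4.5 GB of 3 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.

I tried setting spark.yarn.executor.memoryOverhead to 100000 and still got a similar error.

I don't understand why Spark doesn't spill to disk when it runs short of memory, or whether YARN is what is causing the problem.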

Answer 1

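Please share the code you are using to read the table, as well as your cluster architecture.

Container killed by YARN for exceeding memory limits. 4.5 GB of 3 GB physical memory used.

Try: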

spark-submit 
--master yarn-client 
--num-executors 4  
--executor-memory 100G
--executor-cores 4 
--conf spark.yarn.executor.memoryOverhead=20480

if you have 128 GB of RAM.
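To see why these numbers assume roughly 128 GB of RAM per node, it helps to add up what each executor container asks YARN for: the executor memory plus the memory overhead. The following is a rough sketch of that arithmetic only. The helper name `container_mb` is made up for illustration, and the default overhead rule (10% of executor memory, with a 384 MB minimum) is my recollection of the Spark 1.6 default, so treat it as an assumption. YARN also rounds requests up to its allocation increment, which is ignored here.

```python
def container_mb(executor_memory_mb, overhead_mb=None):
    """Approximate memory (in MB) requested from YARN for one executor.

    Hypothetical helper for illustration; not part of Spark or YARN.
    """
    if overhead_mb is None:
        # Assumed Spark 1.6 default: max(384 MB, 10% of executor memory)
        overhead_mb = max(384, int(0.10 * executor_memory_mb))
    return executor_memory_mb + overhead_mb


# Settings from the question: 4G executor memory + 2048 MB overhead
question_mb = container_mb(4 * 1024, 2048)

# Settings suggested in this answer: 100G executor memory + 20480 MB overhead
suggested_mb = container_mb(100 * 1024, 20480)

print("question:  %d MB per executor" % question_mb)
print("suggested: %d MB per executor (%.0f GB)" % (suggested_mb, suggested_mb / 1024.0))
print("suggested total for 4 executors: %.0f GB" % (4 * suggested_mb / 1024.0))
```

With the suggested settings each executor needs about 120 GB, so it only fits on a node where YARN is allowed to hand out that much memory to a single container.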

The situation is clear: you are running out of RAM. Try rewriting your code in a disk-friendly way.
