Why doesn't spark.executor.instances work?

Date: 2018-07-25 04:51:42

Tags: apache-spark

I'm using 40 r4.2xlarge slaves and one master of the same instance type. An r4.2xlarge has 8 cores and 61 GB of memory.

The settings are as follows (see the code sketch after this list):

  • spark.executor.instances 280
  • spark.executor.cores 1
  • spark.executor.memory 8G
  • spark.driver.memory 40G
  • spark.yarn.executor.memoryOverhead 10240
  • spark.dynamicAllocation.enabled false
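
For reference, a minimal sketch of the same configuration expressed programmatically (the application name is a placeholder; the values are copied from the list above):

  import org.apache.spark.SparkConf
  import org.apache.spark.sql.SparkSession

  // Sketch only: the same settings as above, set via SparkConf.
  val conf = new SparkConf()
    .setAppName("example-job")  // placeholder app name
    .set("spark.executor.instances", "280")
    .set("spark.executor.cores", "1")
    .set("spark.executor.memory", "8g")
    .set("spark.driver.memory", "40g")
    .set("spark.yarn.executor.memoryOverhead", "10240")
    .set("spark.dynamicAllocation.enabled", "false")

  val spark = SparkSession.builder().config(conf).getOrCreate()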

When I observe a job running on this cluster in Ganglia, overall CPU usage is only around 30%, and the resource manager's "Aggregated Metrics by Executor" table shows only two executors per slave node.

Why does this cluster run only two executors per slave node even with 280 spark.executor.instances?

---- UPDATE ----

I found yarn-site.xml under /etc/hadoop/conf.empty; it contains:

  • yarn.scheduler.maximum-allocation-mb 54272
  • yarn.scheduler.maximum-allocation-vcores 128
  • yarn.nodemanager.resource.cpu-vcores 8
  • yarn.nodemanager.resource.memory-mb 54272

1 answer:

Answer 0 (score: 1)

If you are running the job on YARN, the number of executors is not determined by this parameter alone; it also depends on several YARN configuration parameters. The relevant parameters are likely:

yarn.scheduler.maximum-allocation-mb
yarn.scheduler.maximum-allocation-vcores
yarn.nodemanager.resource.cpu-vcores
yarn.nodemanager.resource.memory-mb

Please check those parameters in yarn-site.xml.
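
In particular, the memory settings alone appear to explain the observed count: each executor container needs spark.executor.memory plus spark.yarn.executor.memoryOverhead, and a node can host only as many such containers as yarn.nodemanager.resource.memory-mb allows. A back-of-the-envelope sketch using the values from the question (container-size rounding is simplified):

  // Rough capacity check; assumes YARN packs executors purely by container memory.
  val executorMemoryMb = 8 * 1024                             // spark.executor.memory = 8G
  val memoryOverheadMb = 10240                                // spark.yarn.executor.memoryOverhead
  val containerMb      = executorMemoryMb + memoryOverheadMb  // 18432 MB per executor container
  val nodeMemoryMb     = 54272                                // yarn.nodemanager.resource.memory-mb
  val executorsPerNode = nodeMemoryMb / containerMb           // 54272 / 18432 = 2 (integer division)
  val totalExecutors   = 40 * executorsPerNode                // 40 slave nodes => 80 executors, far below 280

  println(s"per node: $executorsPerNode, cluster-wide: $totalExecutors")

Under this arithmetic, reducing the per-executor memory footprint (or raising the memory available to YARN on each node) is what would allow more than two executors per node.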