I have set up a Spark (1.6) standalone cluster with 1 master and 3 machines added as workers in the conf/slaves file. Even though I have allocated 4 GB of memory to each of my workers in Spark, why does the application use only 1024 MB when it is running? I would like it to use all 4 GB allocated to it. Help me figure out where and what I am doing wrong.
Below is the screenshot of the Spark master page (taken while the application is running via spark-submit), where the Memory column shows 1024.0 MB used in brackets next to 4.0 GB.
I also tried setting the --executor-memory 4G option with spark-submit, and it does not work (as suggested in How to change memory per node for apache spark worker).
These are the options I have set in the spark-env.sh file:
export SPARK_WORKER_CORES=3
export SPARK_WORKER_MEMORY=4g
export SPARK_WORKER_INSTANCES=2
Answer 0: (Score: 4)
Another workaround is to try setting the following parameters in the conf/spark-defaults.conf file:
spark.driver.cores 4
spark.driver.memory 2g
spark.executor.memory 4g
Once you have set the above (only the last line in your case), shut down all the workers and restart them. It is best to initialize executor memory this way, because your problem seems to be that no executor can allocate all the available memory of its worker.
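For reference, restarting the standalone cluster after editing conf/spark-defaults.conf can be done with the bundled scripts, roughly like this (run from the Spark installation directory on the master; this is a sketch of the restart step, not part of the original answer):

# Stop the master and all workers listed in conf/slaves, then start them again
sbin/stop-all.sh
sbin/start-all.sh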
Answer 1: (Score: 1)
The parameter you are looking for is executor-memory.
Try providing it to your Spark application when you start it:
--executor-memory 4g
When you set the worker memory to 4g, the largest executor you can run on that worker is 4g. PS: you can have different configurations (each worker having a different worker memory).
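For example, a submit command with that flag might look roughly like this (the master URL, class name, and jar path below are placeholders, not values from the original question):

spark-submit \
  --master spark://your-master-host:7077 \
  --executor-memory 4g \
  --class com.example.YourApp \
  your-app.jar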
Answer 2: (Score: 0)
Create a file named spark-env.sh in the spark/conf directory and add this line:
SPARK_EXECUTOR_MEMORY=4g
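A minimal sketch of that step, assuming SPARK_HOME points at the Spark installation (the template copy is only needed if spark-env.sh does not exist yet, and the workers must be restarted afterwards for the change to take effect):

cd $SPARK_HOME/conf
# Create spark-env.sh from the shipped template if it is not there yet
cp spark-env.sh.template spark-env.sh
# Append the executor memory setting
echo 'export SPARK_EXECUTOR_MEMORY=4g' >> spark-env.sh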