Changing the number of cores OpenMPI can "see"

Date: 2019-06-03 18:46:26

Tags: mpi openmpi precompiled

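I'm running an executable that calls mpirun (I don't have access to the source code). I get the following error, which is common when the number of cores requested exceeds the number of cores available on the CPU: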

There are not enough slots available in the system to satisfy the 12
slots that were requested by the application:

  /Users/me/Library/app/executable

Either request fewer slots for your application, or make more slots
available for use.

A "slot" is the Open MPI term for an allocatable unit where we can
launch a process.  The number of slots available are defined by the
environment in which Open MPI processes are run:

  1. Hostfile, via "slots=N" clauses (N defaults to number of
     processor cores if not provided)
  2. The --host command line parameter, via a ":N" suffix on the
     hostname (N defaults to 1 if not provided)
  3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
  4. If none of a hostfile, the --host command line parameter, or an
     RM is present, Open MPI defaults to the number of processor cores

In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.

Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.

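My problem is that I cannot change the command-line options passed to mpirun, e.g. to add --oversubscribe. Instead, I need to change the default number of cores that OpenMPI "sees". (Otherwise this would be easy to solve, as in this case.)

Is there an environment variable, or something else I can update, to trick OpenMPI into working?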

1 Answer:

Answer 0 (score: 0)

Ah. I found the default OpenMPI hostfile at /usr/local/etc/openmpi-default-hostfile (on a Mac) and added a new line at the end:

localhost slots=12

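Since my system has 6 cores, OpenMPI was reading a default slot count of 6 (the error only occurred when more than 6 CPUs were requested). But I have 12 hardware threads and want to make full use of the CPU.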
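For anyone who wants to script the hostfile edit described above, here is a minimal sketch. The helper name add_localhost_slots is hypothetical (it is not part of Open MPI), and the hostfile path differs between installations, so it is passed in as an argument rather than hard-coded. The sketch also assumes the file, if it already exists, ends with a newline.

```shell
#!/bin/sh
# add_localhost_slots FILE N
# Hypothetical helper: appends "localhost slots=N" to FILE unless a line
# starting with "localhost" is already present, so repeated runs do not
# pile up duplicate entries.
add_localhost_slots() {
    hostfile=$1
    slots=$2
    if grep -q '^localhost' "$hostfile" 2>/dev/null; then
        echo "localhost entry already present in $hostfile" >&2
        return 0
    fi
    # Append the same kind of line that was added by hand in the answer
    printf 'localhost slots=%s\n' "$slots" >> "$hostfile"
}
```

On the setup in this answer it would be called as add_localhost_slots /usr/local/etc/openmpi-default-hostfile 12, from a shell that has write permission to that file.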

This works for me because I'm not running mpirun from the command line myself.
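On the environment-variable part of the question: Open MPI reads MCA parameters from variables prefixed with OMPI_MCA_, and these are inherited by an mpirun that another program launches. The sketch below is untested on the asker's setup and assumes an Open MPI release up to 4.x, where the parameters orte_default_hostfile and rmaps_base_oversubscribe exist; newer releases may use different names. The hostfile location is a hypothetical example.

```shell
#!/bin/sh
# Hypothetical per-user hostfile location; any writable path would do.
hostfile="${TMPDIR:-/tmp}/my-openmpi-hostfile"
printf 'localhost slots=12\n' > "$hostfile"

# Option 1 (assumes Open MPI <= 4.x): point mpirun at the private
# hostfile instead of editing the system-wide default hostfile.
export OMPI_MCA_orte_default_hostfile="$hostfile"

# Option 2 (assumes Open MPI <= 4.x): environment equivalent of passing
# --oversubscribe on the mpirun command line.
export OMPI_MCA_rmaps_base_oversubscribe=1
```

Either variable on its own should be enough. They need to be exported in the environment from which the executable that calls mpirun is started.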