
Time: 2018-03-12 16:06:13

Tags: mpi openmpi mpi4py

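I am running OpenAI baselines, specifically the Hindsight Experience Replay (HER) code. (However, I think this question is independent of that code and is really about MPI, which is why I am posting on StackOverflow.)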

You can see the README there, but the key point is that the command to run is:

python -m baselines.her.experiment.train --num_cpu 20


I successfully ran the HER training script with 1 to 4 CPUs (i.e., --num_cpu x for x = 1, 2, 3, 4) on a single machine with:

  • Ubuntu 16.04
  • Python 3.5.2
  • TensorFlow 1.5.0
  • One TitanX GPU

The number of CPUs appears to be 8, since I have a quad-core Intel i7 processor with hyperthreading, and Python confirms that it can see 8 CPUs.

(py3-tensorflow) daniel@titan:~/baselines$ ipython
Python 3.5.2 (default, Nov 23 2017, 16:37:01) 
Type 'copyright', 'credits' or 'license' for more information
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import os, multiprocessing

In [2]: os.cpu_count()
Out[2]: 8

In [3]: multiprocessing.cpu_count()
Out[3]: 8

However, running with 5 CPUs fails with the following message:

(py3-tensorflow) daniel@titan:~/baselines$ python -m baselines.her.experiment.train --num_cpu 5
A request was made to bind to that would result in binding more
processes than cpus on a resource:

   Bind to:     CORE
   Node:        titan
   #processes:  2
   #cpus:       1

You can override this protection by adding the "overload-allowed"
option to your binding directive.


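At a high level, this code takes the --num_cpu argument and uses the Python subprocess module to run an mpirun command. However, checking mpirun --help on the command line does not show overload-allowed as a valid argument.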
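To make that description concrete, here is a minimal sketch of how a script might re-launch itself under mpirun through subprocess. It is not the baselines implementation: build_mpirun_command and relaunch_under_mpi are hypothetical names, train.py is a placeholder, and the sketch assumes only Open MPI's -np and --bind-to options.

```python
import subprocess
import sys


def build_mpirun_command(num_procs, bind_to=None, script_args=None):
    """Return the argv list for launching this interpreter under mpirun.

    Hypothetical helper for illustration; not taken from baselines.
    """
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    cmd = ["mpirun", "-np", str(num_procs)]
    if bind_to is not None:
        # For example "core", "hwthread" or "none" (Open MPI binding targets).
        cmd += ["--bind-to", bind_to]
    cmd.append(sys.executable)
    cmd += list(script_args if script_args is not None else sys.argv)
    return cmd


def relaunch_under_mpi(num_procs, bind_to=None):
    """Run the current script under mpirun and wait for it to finish."""
    cmd = build_mpirun_command(num_procs, bind_to)
    # check_call raises CalledProcessError when mpirun exits non-zero,
    # which is how a binding error like the one above would surface.
    subprocess.check_call(cmd)


if __name__ == "__main__":
    # Only print the command; actually launching requires mpirun on PATH.
    print(build_mpirun_command(5, bind_to="core", script_args=["train.py"]))
```

Because the options are assembled in Python rather than typed on the command line, any change to the binding policy has to be made where the argv list is built.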



In case it helps, here is the pip list from my virtual environment:

(py3.5-mpi-practice) daniel@titan:~$ pip list
DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning.
decorator (4.2.1)
ipython (6.2.1)
ipython-genutils (0.2.0)
jedi (0.11.1)
line-profiler (2.1.2)
mpi4py (3.0.0)
numpy (1.14.1)
parso (0.1.1)
pexpect (4.4.0)
pickleshare (0.7.4)
pip (9.0.1)
pkg-resources (0.0.0)
pprintpp (0.3.0)
prompt-toolkit (1.0.15)
ptyprocess (0.5.2)
Pygments (2.2.0)
setuptools (20.7.0)
simplegeneric (0.8.1)
six (1.11.0)
traitlets (4.3.2)
wcwidth (0.1.7)

So, TL;DR:

  • How do I fix this error in my code?
  • What happens if I add the "overload-allowed" option? Is it safe?


2 Answers:

Answer 0: (score: 0)

overload-allowed is a qualifier passed to the --bind-to parameter of mpirun, as in mpirun ... --bind-to core:overload-allowed. The exact syntax is unknown to me, but from ...




So it is safe to "overload" a "core", but it is not the first candidate for a performance boost.
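As a rough illustration of the qualifier, the sketch below builds the --bind-to argument in Python. It assumes the colon-separated modifier syntax (for example core:overload-allowed) described in Open MPI's mpirun man page; bind_to_args is a hypothetical helper, and the exact spelling should be checked against man mpirun for the installed version.

```python
# Modifiers that Open MPI's mpirun man page lists for --bind-to targets.
KNOWN_MODIFIERS = ("if-supported", "overload-allowed")


def bind_to_args(target, modifiers=()):
    """Build the argv fragment for mpirun's --bind-to option.

    Hypothetical helper: joins the target and its modifiers with ':',
    as in 'core:overload-allowed'.
    """
    for modifier in modifiers:
        if modifier not in KNOWN_MODIFIERS:
            raise ValueError("unknown modifier: %s" % modifier)
    return ["--bind-to", ":".join((target,) + tuple(modifiers))]


command = ["mpirun", "-np", "5"] + bind_to_args("core", ["overload-allowed"])
print(" ".join(command))  # prints: mpirun -np 5 --bind-to core:overload-allowed
```

With this directive several processes may share one core, so the job runs but the processes sharing a core compete for it.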



Answer 1: (score: 0)

The error message indicates that mpi4py is built on top of Open MPI.


mpirun --use-hwthread-cpus ...
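The arithmetic behind this suggestion can be sketched as follows, assuming that Open MPI treats each physical core as one cpu by default and each hardware thread as one cpu when --use-hwthread-cpus is given. The function names are hypothetical, and the 4-core, 2-thread figures come from the quad-core hyperthreaded i7 described in the question.

```python
def available_cpus(physical_cores, threads_per_core, use_hwthreads=False):
    """Number of cpus mpirun can bind to without sharing (assumed model)."""
    if physical_cores < 1 or threads_per_core < 1:
        raise ValueError("counts must be positive")
    if use_hwthreads:
        # Each hardware thread counts as an independent cpu.
        return physical_cores * threads_per_core
    # Default assumption: one process per physical core.
    return physical_cores


def fits_without_overload(num_procs, physical_cores, threads_per_core,
                          use_hwthreads=False):
    """True if every process can get its own cpu under the assumed model."""
    return num_procs <= available_cpus(physical_cores, threads_per_core,
                                       use_hwthreads)


# Quad-core i7 with hyperthreading: 4 cores, 2 hardware threads each.
print(fits_without_overload(4, 4, 2))                      # True
print(fits_without_overload(5, 4, 2))                      # False
print(fits_without_overload(5, 4, 2, use_hwthreads=True))  # True
```

Under this model 5 processes do not fit on 4 cores, which is consistent with the error reporting 2 processes for 1 cpu, while counting hardware threads gives 8 cpus and leaves room for up to 8 processes.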