I have been stuck on this for several days and cannot solve my problem.
I am running:
mpiexec -hostfile ~/machines -nolocal -pernode mkdir -p $dstpath
where $dstpath points to the current directory and "machines" is a file with the following contents:
node01
node02
node03
node04
This is the error output:
Failed to parse XML input with the minimalistic parser. If it was not
generated by hwloc, try enabling full XML support with libxml2.
[node01:06177] [[6421,0],0] ORTE_ERROR_LOG: Error in file base/plm_base_launch_support.c at line 891
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).
--------------------------------------------------------------------------
[node01:06177] 1 more process has sent help message help-errmgr-base.txt / failed-daemon-launch
[node01:06177] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Failed to parse XML input with the minimalistic parser. If it was not
generated by hwloc, try enabling full XML support with libxml2.
[node01:06181] [[6417,0],0] ORTE_ERROR_LOG: Error in file base/plm_base_launch_support.c at line 891
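I have 4 machines, node01 through node04. To log in to these 4 nodes, I first have to log in to node00. I am trying to run some distributed graph functions. The graph software is installed on node01 and is supposed to synchronize with the other nodes using mpiexec.
What I have done:
Made sure passwordless login is set up everywhere; every machine can ssh to any other machine without problems.
There is a hostfile in the home directory.
echo $PATH gives /home/myhome/bin:/home/myhome/.local/bin:/usr/include/openmpi:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
echo $LD_LIBRARY_PATH gives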
/usr/lib/openmpi/lib
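This setup was working before, but it suddenly started producing these errors. I asked my admin to reinstall the machines, but I still get the same errors. I tried running on one node at a time, and that gives the same error too. I am not at all familiar with the command line, so any suggestions are welcome. I tried reinstalling OpenMPI both from source and with sudo apt-get install openmpi-bin. I am on Ubuntu 16.04 LTS.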
Answer 0: (score: -1)
You should focus on fixing:
Failed to parse XML input with the minimalistic parser. If it was not
generated by hwloc, try enabling full XML support with libxml2.
[node01:06177] [[6421,0],0] ORTE_ERROR_LOG: Error in file base/plm_base_launch_support.c at line 891
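One possible cause of this message (an assumption, not something the output above confirms) is that the nodes do not all resolve the same Open MPI installation, for example after a mix of source and apt installs. The sketch below prints which orted and mpiexec each node in the hostfile finds over a non-interactive ssh session, along with the reported version. HOSTFILE, REMOTE, list_hosts and check_nodes are illustrative names, not part of Open MPI.

```shell
#!/bin/sh
# Sketch: compare the Open MPI installation seen by each node in a hostfile.
# HOSTFILE and REMOTE are hypothetical helper variables for this example.
HOSTFILE="${HOSTFILE:-$HOME/machines}"
REMOTE="${REMOTE:-ssh}"   # command used to reach a node; set REMOTE=echo for a dry run

# Print the host names from a hostfile, skipping comments and blank lines
# and dropping anything after the host name (such as "slots=4").
list_hosts() {
    sed -e 's/#.*//' -e 's/^[[:space:]]*//' -e 's/[[:space:]].*//' -e '/^$/d' "$1"
}

# Run the same check on every host and label each block of output.
check_nodes() {
    for host in $(list_hosts "$1"); do
        echo "== $host =="
        $REMOTE "$host" 'which orted mpiexec; mpiexec --version 2>&1 | head -n 1'
    done
}

# To run the check against the hostfile, uncomment:
# check_nodes "$HOSTFILE"
```

If the paths or version lines differ between nodes, or orted is not found on some node, aligning the installations (and PATH and LD_LIBRARY_PATH for non-interactive shells) would be a reasonable first thing to try.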