I want to submit jobs efficiently.
So I wrote a bash job script and a meta script, shown below.
First, job.sh:
#!/bin/sh -l
#SBATCH -J test
#SBATCH -p bigmem
#SBATCH -N 4
#SBATCH --ntasks-per-node 1
#SBATCH -o logs/%j_%x.out
#SBATCH -e logs/%j_%x.err
#SBATCH --time 1:00:00
module load gnu/8.2.0 openmpi/3.1.3_gnu8.2 anaconda/2.7
echo "START"; date
mpirun --n 120 ./e-opt -i input_rev.i \
Mesh/file/file="${mesh_fpath}" \
GlobalParams/s0="${act}" \
Outputs/file_base="${outputs_fbase}"
echo "END"; date
Second, meta.sh:
mapfile -t activities < activities.txt
mesh_path="inputs/*.inp"
mesh_files=($mesh_path)
output_path="outputs/test/"
for ((i=0;i<${#activities[@]};i++));
do
mesh_fname="${mesh_files[i]}"
fbasename="$(basename $mesh_fname)"
output_fbase="${output_path}${fbasename%.*}"
sbatch --export=act="${activities[i]}"\
mesh_fpath="${mesh_files[i]}"\
outputs_fbase="${output_fbase}" job.sh
done
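
A note on the sbatch call above: as far as I understand the sbatch documentation, --export takes a single comma-separated list of NAME=value pairs, and listing variables without ALL means the rest of the submitting shell's environment is not propagated to the job. Below is a minimal sketch of building such a list. build_export is a hypothetical helper name and the sample values are made up for illustration; whether this is related to the error is not confirmed.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): joins NAME=value pairs with
# commas and prefixes ALL so the caller's environment is also exported.
# Assumption: none of the values contain a comma themselves.
build_export() {
    local IFS=,
    printf 'ALL,%s\n' "$*"
}

# Made-up sample values standing in for one loop iteration of meta.sh.
export_arg="$(build_export "act=0.5" "mesh_fpath=inputs/a.inp" "outputs_fbase=outputs/test/a")"

# Print the command instead of running it, so it can be checked first.
echo "sbatch --export=${export_arg} job.sh"
```

Printing the command first makes it easy to see whether the variables end up separated by commas or run together on one line.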
I thought there was no problem, but when I submit the jobs, the following error message appears:
--------------------------------------------------------------------------
An ORTE daemon has unexpectedly failed after launch and before
communicating back to mpirun. This could be caused by a number
of factors, including an inability to create a connection back
to mpirun due to a lack of common network interfaces and/or no
route found between them. Please check network connectivity
(including firewalls and network routing requirements).
--------------------------------------------------------------------------
What am I missing? Please give me some hints. Thanks!