Using several bash variables in a SLURM batch script

Time: 2020-01-24 22:18:34

Tags: bash slurm

I want to submit a batch of jobs efficiently.

To do that, I wrote a bash job script and a meta script that submits it, shown below.

First, job.sh:

#!/bin/sh -l
#SBATCH -J test
#SBATCH -p bigmem
#SBATCH -N 4
#SBATCH --ntasks-per-node 1
#SBATCH -o logs/%j_%x.out
#SBATCH -e logs/%j_%x.err
#SBATCH --time 1:00:00

module load gnu/8.2.0 openmpi/3.1.3_gnu8.2 anaconda/2.7

echo "START"; date

mpirun --n 120 ./e-opt -i input_rev.i \
       Mesh/file/file="${mesh_fpath}" \
       GlobalParams/s0="${act}" \
       Outputs/file_base="${outputs_fbase}"

echo "END"; date

Second, meta.sh:

mapfile -t activities < activities.txt

mesh_path="inputs/*.inp"
mesh_files=($mesh_path)

output_path="outputs/test/"
for ((i=0;i<${#activities[@]};i++));
do
mesh_fname="${mesh_files[i]}"
fbasename="$(basename $mesh_fname)"
output_fbase="${output_path}${fbasename%.*}"
sbatch --export=act="${activities[i]}"\
        mesh_fpath="${mesh_files[i]}"\
        outputs_fbase="${output_fbase}" job.sh
done

I thought there was nothing wrong with this, but when I submit the jobs, the following error message appears:

--------------------------------------------------------------------------
An ORTE daemon has unexpectedly failed after launch and before
communicating back to mpirun. This could be caused by a number
of factors, including an inability to create a connection back
to mpirun due to a lack of common network interfaces and/or no
route found between them. Please check network connectivity
(including firewalls and network routing requirements).
--------------------------------------------------------------------------

What am I missing? Please give me some hints. Thanks!
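
One thing that may be worth checking, though nothing here confirms it as the cause of the ORTE error: sbatch's --export option takes a single comma-separated list, and unless ALL is part of that list, only the named variables (plus the SLURM_* ones) are passed on to the job. In meta.sh the three assignments end up separated by spaces after the line continuations, so only act=... is attached to --export. The sketch below builds the whole list as one string. build_export_arg and the sample mesh file name are hypothetical, and the script only prints the sbatch command (a dry run) instead of submitting anything.

```shell
#!/bin/bash
# Sketch: build ONE comma-separated value for sbatch --export.
# build_export_arg is a hypothetical helper, not part of SLURM.
build_export_arg() {
    local act="$1" mesh_fpath="$2" outputs_fbase="$3"
    # ALL keeps the submitting shell's environment; the rest are the
    # three variables that job.sh reads.
    printf 'ALL,act=%s,mesh_fpath=%s,outputs_fbase=%s' \
        "$act" "$mesh_fpath" "$outputs_fbase"
}

# Dry run with made-up sample values: print the command, do not submit.
mesh_fpath="inputs/mesh_a.inp"                 # hypothetical file name
fbasename="$(basename "$mesh_fpath")"
outputs_fbase="outputs/test/${fbasename%.*}"   # same naming as meta.sh
echo sbatch --export="$(build_export_arg 0.5 "$mesh_fpath" "$outputs_fbase")" job.sh
```

In the real loop, the echo would be dropped and the three arguments would come from "${activities[i]}", "${mesh_files[i]}" and "${output_fbase}". This assumes none of the values contains a comma, since a comma would be read as a separator in the --export list.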

0 answers:

No answers yet