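I would like to know whether I can implement the following logic:
Given fold_num jobs to complete and a limit on the number of worker processes, say worker_num, I want to run worker_num processes in parallel until all fold_num jobs are done. Finally, there is some further processing of the results of all these jobs. We can assume that fold_num is always a multiple of worker_num.
So far I have come up with the following snippet, which does not work yet, using hints from "How to wait in bash for several subprocesses to finish and return exit code !=0 when any subprocess ends with code !=0?"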
#!/bin/bash
worker_num=5
fold_num=10
pids=""
result=0
for fold in $(seq 0 $(( $fold_num-1 ))); do
    pids_idx=$(( $fold % ${worker_num} ))
    echo "pids_idx=${pids_idx}, pids[${pids_idx}]=${pids[${pids_idx}]}"
    wait ${pids[$pids_idx]} || let "result=1"
    if [ "$result" == "1" ]; then
        echo "some job is abnormal, aborting"
        exit
    fi
    cmd="echo fold$fold" # use echo as an example, real command can be time-consuming to run
    $cmd &
    pids[${pids_idx}]="$!"
    echo "pids=${pids[*]}"
done
# when the for-loop completes, do something else...
The output is as follows:
pids_idx=0, pids[0]=
pids=5846
pids_idx=1, pids[1]=
fold0
pids=5846 5847
fold1
pids_idx=2, pids[2]=
pids=5846 5847 5848
fold2
pids_idx=3, pids[3]=
pids=5846 5847 5848 5849
fold3
pids_idx=4, pids[4]=
pids=5846 5847 5848 5849 5850
pids_idx=0, pids[0]=5846
fold4
./test_wait.sh: line 12: wait: pid 5846 is not a child of this shell
some job is abnormal, aborting
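Questions:
1. The pids array seems to have recorded the correct process IDs, yet wait fails on them. Any idea how to fix this?
2. Do we need to use wait after the for-loop? If so, what should be done after the for-loop?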
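A possible explanation for question 1, offered as an assumption from documented shell behavior rather than something confirmed in this thread: during the first worker_num iterations the slot is still empty, so the unquoted `wait ${pids[$pids_idx]}` runs as a bare `wait`. A bare `wait` waits for every child, and POSIX allows the shell to discard the remembered statuses afterwards, which would explain why PID 5846 is later reported as "not a child of this shell". The sketch below waits only on slots that already hold a PID, and for question 2 it waits on each remaining PID after the loop. The names run_fold and run_all_folds are hypothetical stand-ins, not part of the original script.

```shell
#!/bin/bash
# Sketch: keep at most worker_num jobs running, one PID per slot.

run_fold() {   # hypothetical stand-in for the real, time-consuming command
    echo "fold$1"
}

run_all_folds() {
    local worker_num=$1 fold_num=$2
    local pids=() fold idx pid result=0

    for (( fold = 0; fold < fold_num; fold++ )); do
        idx=$(( fold % worker_num ))
        # Wait only if this slot already holds a PID; a bare `wait`
        # would wait for every child instead of just this one.
        if [[ -n "${pids[idx]:-}" ]]; then
            wait "${pids[idx]}" || return 1
        fi
        run_fold "$fold" &
        pids[idx]=$!
    done

    # The last batch is still running when the loop ends: wait for each PID.
    for pid in "${pids[@]}"; do
        wait "$pid" || result=1
    done
    return "$result"
}

if run_all_folds 5 10; then
    echo "all folds finished"   # post-processing would go here
else
    echo "some job is abnormal, aborting" >&2
fi
```

Note that each slot is waited on in turn, so a slow job in one slot can leave the other slots idle until it finishes; an early return also leaves any jobs that are still running untouched.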
Answer 0 (score: 0)
OK, I think I have a working solution, with hints from the GNU parallel folks.
worker_names=("foo" "bar") # elements are separated by spaces, not commas; note that bash cannot export arrays
export worker_num=${#worker_names[@]}
fold_num=10 # number of jobs, as in the question
function some_computation {
    fold=$1
    cmd="..." #involves worker_names and fold
    echo $cmd; $cmd
}
export -f some_computation # important, to make this function visible to subprocesses
for fold in $(seq 0 $(( $fold_num-1 ))); do
    sem -j $worker_num some_computation $fold
done
sem --wait # wait for all jobs to complete
# do something below
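A few notes:
- I could not get plain parallel to work, because I need to do post-computation processing after the parallel jobs, and the parallel version I tried did not wait for the jobs to finish. So I used GNU sem, which stands for semaphore.
- Exporting the computation function is also necessary. Note the -f option.
- sem --wait does exactly what is needed to wait for the parallel jobs.
HTH.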
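For readers without GNU parallel installed, the same pattern (export the function, cap the number of concurrent jobs, block until everything is done) can be sketched with `xargs -P`. This is a swapped-in technique, not what the answer above uses: `-P` is a GNU/BSD extension to xargs, and some_computation here is a hypothetical placeholder for the real work, as is the helper name run_with_xargs.

```shell
#!/bin/bash
# Sketch: fan out jobs with xargs -P instead of sem.

some_computation() {   # hypothetical placeholder for the real work
    echo "fold$1"
}
export -f some_computation   # without this, the child bash cannot see the function

run_with_xargs() {
    local worker_num=$1 fold_num=$2
    # xargs starts up to worker_num child shells at a time and only
    # returns once every job has finished.
    seq 0 $(( fold_num - 1 )) |
        xargs -P "$worker_num" -n 1 bash -c 'some_computation "$1"' _
}

if run_with_xargs 2 6; then
    echo "all jobs finished"   # post-processing would go here
fi
```

Since xargs itself blocks until all jobs are done, no separate wait step is needed, and its exit status is nonzero if any job failed.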