Question

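I am currently generating roughly 100 files from my script. I would like to iterate over these files in batches of twenty, run each one through another script, and then delete the files once they have been processed (cleanup). I believe GNU Parallel can do this, but I am not sure how.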

# test if files exists and run
if [ "$(ls -A ${base_dir}/schedule)" ]; then

    while [ "$(ls -A ${base_dir}/schedule)" ]; do

        # current run of 20 files
        batch=`ls ${base_dir}/schedule | head -n 20`

        # parallel run on 4 processors
        parallel -j4 ./script.sh ${batch} ::: {1..20}

        # cleanup
        for file in "${batch}"; do
            rm "${base_dir}/schedule/${file}"
        done

    done
fi

The expected output would look something like this:

# running first batch of twenty
 ./script.sh 1466-10389-data.nfo # after file has finished, rm 1466-10389-data.nfo
 ./script.sh 1466-10709-data.nfo # etc
 ./script.sh 1466-11230-data.nfo # etc
 ./script.sh 1466-11739-data.nfo
 ./script.sh 1466-11752-data.nfo
 ./script.sh 1466-13074-data.nfo
 ./script.sh 1466-14009-data.nfo
 ./script.sh 1466-1402-data.nfo
 ./script.sh 1466-14401-data.nfo
 ./script.sh 1466-14535-data.nfo
 ./script.sh 1466-1588-data.nfo
 ./script.sh 1466-17012-data.nfo
 ./script.sh 1466-17611-data.nfo
 ./script.sh 1466-18688-data.nfo
 ./script.sh 1466-19469-data.nfo
 ./script.sh 1466-19503-data.nfo
 ./script.sh 1466-21044-data.nfo
 ./script.sh 1466-21819-data.nfo
 ./script.sh 1466-22325-data.nfo
 ./script.sh 1466-23437-data.nfo

# wait until all are finished, OR queue up the next file so that at all
# times twenty files are running, until the directory is empty

Answer 1

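If I understand what you are trying to do, and if files are not continuously being created in schedule, then the script can be replaced by these two lines (untested):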

ls -A "${base_dir}/schedule" | xargs -n 1 -P 4 ./script.sh
# the glob must stay outside the quotes, otherwise rm looks for a file literally named "*"
rm "${base_dir}/schedule"/*
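
A variation on the two lines above, as a minimal sketch rather than a tested drop-in: each file is deleted right after its own job succeeds, instead of clearing the whole directory at the end. The temporary directory, the `job-N.nfo` file names, and the stub `script.sh` are assumptions made up for the demo. `-print0`, `-0`, and `-P` are extensions found in GNU and BSD find/xargs, not plain POSIX.

```shell
#!/usr/bin/env bash
# Demo setup (hypothetical): a throwaway schedule directory with 5 dummy files.
base_dir=$(mktemp -d)
mkdir -p "${base_dir}/schedule"
for i in 1 2 3 4 5; do : > "${base_dir}/schedule/job-${i}.nfo"; done

# Stub worker standing in for the real ./script.sh (assumption for the demo).
worker="${base_dir}/script.sh"
printf '#!/bin/sh\necho "processed $1"\n' > "${worker}"
chmod +x "${worker}"

# Run up to 4 workers at once. Inside sh -c, $1 is the worker and $2 is the
# file path appended by xargs; the file is removed only if the worker succeeds.
find "${base_dir}/schedule" -type f -print0 |
    xargs -0 -n 1 -P 4 sh -c '"$1" "${2##*/}" && rm -- "$2"' _ "${worker}"

# Count what is left in the schedule directory.
remaining=$(find "${base_dir}/schedule" -type f | wc -l | tr -d ' ')
echo "files left: ${remaining}"
```

Because the delete is chained with `&&`, a file whose job fails stays in the directory and can be retried or inspected later.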

Answer 2

My guess is that you want to run 20 scripts in parallel:

# ls prints bare file names, so rm needs the directory prefix
ls -A "${base_dir}/schedule" | parallel -j20 ./script.sh {/}\; rm "${base_dir}/schedule"/{/}

Your while loop confuses me a little: is it needed because more files may be added while you are running? If so, you need to keep the while loop:

while [ "$(ls -A "${base_dir}/schedule")" ]; do
  # ls prints bare file names, so rm needs the directory prefix
  ls -A "${base_dir}/schedule" | parallel -j20 ./script.sh {/}\; rm "${base_dir}/schedule"/{/}
done
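
The loop can also be fed full paths, so that `{}` is the path to delete and `{/}` is the bare file name handed to the script. The following is a sketch under assumptions, not a tested replacement: the temporary directory, the dummy `job-N.nfo` files, and the stub `script.sh` exist only for the demo, and the demo skips itself if GNU Parallel is not installed.

```shell
#!/usr/bin/env bash
# Demo setup (hypothetical): throwaway directory, stub worker, 5 dummy files.
base_dir=$(mktemp -d)
mkdir -p "${base_dir}/schedule"
for i in 1 2 3 4 5; do : > "${base_dir}/schedule/job-${i}.nfo"; done
worker="${base_dir}/script.sh"          # stand-in for the real ./script.sh
printf '#!/bin/sh\necho "processed $1"\n' > "${worker}"
chmod +x "${worker}"

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # Repeat until no regular files remain, so files added mid-run are picked up.
    while [ -n "$(find "${base_dir}/schedule" -type f | head -n 1)" ]; do
        # -0 reads NUL-delimited names; {/} is the basename, {} the full path.
        find "${base_dir}/schedule" -type f -print0 |
            parallel -0 -j20 "${worker} {/}; rm -- {}"
    done
else
    echo "GNU Parallel not found; skipping the demo" >&2
fi
```

The `;` before `rm` mirrors the answer above: the file is deleted even if the script fails, which keeps the loop from spinning forever on a file that always fails. Swapping it for `&&` would keep failed files around, but the loop would then need its own exit condition.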

Walk through the tutorial at http://www.gnu.org/software/parallel/parallel_tutorial.html. Your command line will love you for it.

Running GNU Parallel while files exist
