Running GNU Parallel when files exist

Time: 2013-10-24 16:54:32

Tags: bash parallel-processing conditional-statements gnu-parallel

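My script currently generates about 100 files. I want to iterate over these files in batches of twenty, run each one through another script, and then delete the files once they are done (cleanup). I believe GNU Parallel can do this, but I'm not sure how.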

# test if files exists and run
if [ "$(ls -A ${base_dir}/schedule)" ]; then

    while [ "$(ls -A ${base_dir}/schedule)" ]; do

        # current run of 20 files
        batch=`ls ${base_dir}/schedule | head -n 20`

        # parallel run on 4 processors
        parallel -j4 ./script.sh ${batch} ::: {1..20}

        # cleanup
        for file in "${batch}"; do
            rm "${base_dir}/schedule/${file}"
        done

    done
fi

The expected output would look something like this:

# running first batch of twenty
 ./script.sh 1466-10389-data.nfo # after file has finished, rm 1466-10389-data.nfo
 ./script.sh 1466-10709-data.nfo # etc
 ./script.sh 1466-11230-data.nfo # etc
 ./script.sh 1466-11739-data.nfo
 ./script.sh 1466-11752-data.nfo
 ./script.sh 1466-13074-data.nfo
 ./script.sh 1466-14009-data.nfo
 ./script.sh 1466-1402-data.nfo
 ./script.sh 1466-14401-data.nfo
 ./script.sh 1466-14535-data.nfo
 ./script.sh 1466-1588-data.nfo
 ./script.sh 1466-17012-data.nfo
 ./script.sh 1466-17611-data.nfo
 ./script.sh 1466-18688-data.nfo
 ./script.sh 1466-19469-data.nfo
 ./script.sh 1466-19503-data.nfo
 ./script.sh 1466-21044-data.nfo
 ./script.sh 1466-21819-data.nfo
 ./script.sh 1466-22325-data.nfo
 ./script.sh 1466-23437-data.nfo

# wait until all are finished, OR queue up the next file so that
# twenty files are running at all times until the directory is empty

2 answers:

Answer 0 (score: 1)

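If I understand what you are trying to do, and if the files in schedule are not being created continuously, then the script can be replaced with these two lines (untested):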

ls -A "${base_dir}/schedule" | xargs -n 1 -P 4 ./script.sh
rm "${base_dir}/schedule"/*
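If each file should only be removed once its own run has succeeded, the same xargs idea can be extended along the following lines. This is a minimal sketch, not the answerer's method: the scratch directory, the five sample file names, and the stand-in worker script are all assumptions made so the example runs on its own, and it assumes an xargs that supports -0 and -P (GNU and BSD both do).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch setup so the sketch is self-contained.
base_dir=$(mktemp -d)
mkdir -p "${base_dir}/schedule"
for i in 1 2 3 4 5; do
    touch "${base_dir}/schedule/1466-${i}-data.nfo"
done

# Stand-in for the real ./script.sh (assumption: it takes one file path).
worker="${base_dir}/script.sh"
printf '#!/bin/sh\necho "processed $(basename "$1")"\n' > "${worker}"
chmod +x "${worker}"

# Run up to 4 jobs at once. Inside sh -c, $0 is the worker and $1 is the
# file; the file is removed only if the worker exited successfully.
find "${base_dir}/schedule" -mindepth 1 -maxdepth 1 -type f -print0 |
    xargs -0 -n 1 -P 4 sh -c '"$0" "$1" && rm -- "$1"' "${worker}" |
    sort > "${base_dir}/processed.log"

# Count what is left in the schedule directory after the run.
remaining=$(find "${base_dir}/schedule" -type f | wc -l | tr -d ' ')
echo "remaining files: ${remaining}"
```

Using find with -print0 and xargs -0 avoids parsing ls output, so file names containing spaces are handled. A file whose job fails stays in the directory, which makes it easy to retry later.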

Answer 1 (score: 1)

My guess is that you want to run 20 scripts in parallel:

# ls prints bare file names, so rm needs the directory prefix
ls -A "${base_dir}/schedule" | parallel -j20 ./script.sh {/}\; rm "${base_dir}/schedule"/{}
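To check how the {} and {/} replacement strings expand before anything is deleted, a dry run can help. The sketch below is an illustration under stated assumptions: the scratch directory and the two sample file names are made up, full paths are passed with ::: rather than piped from ls, and the block simply skips itself when GNU Parallel is not installed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Skip the demo unless the installed "parallel" is GNU Parallel.
version=$(parallel --version 2>/dev/null || true)
case "${version}" in
    *"GNU parallel"*) ;;
    *) echo "GNU Parallel not found; skipping demo" >&2; exit 0 ;;
esac

# Hypothetical scratch directory with two sample files.
base_dir=$(mktemp -d)
mkdir -p "${base_dir}/schedule"
touch "${base_dir}/schedule/1466-10389-data.nfo" \
      "${base_dir}/schedule/1466-10709-data.nfo"

# --dry-run prints each command instead of running it; -k keeps the output
# in input order. {} is the full input path, {/} is its basename.
planned=$(parallel -k --dry-run -j20 ./script.sh {/}\; rm {} ::: "${base_dir}"/schedule/*)
echo "${planned}"
```

Each printed line should show ./script.sh receiving the bare file name and rm receiving the full path. Dropping --dry-run would run the commands for real.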

Your while loop confuses me a bit: is it needed because more files may be added while you are running? If so, you need to add a while loop:

while [ "$(ls -A "${base_dir}/schedule")" ]; do
  ls -A "${base_dir}/schedule" | parallel -j20 ./script.sh {/}\; rm "${base_dir}/schedule"/{}
done

Work through the tutorial at http://www.gnu.org/software/parallel/parallel_tutorial.html. Your command line will love you for it.