I have a list of shell commands I'd like to call, and at most four processes should be running at the same time.
My basic idea is to send commands to the shell until 4 of them are active. The script then keeps checking the process count of all of them by looking for a common string, e.g. "nohup scrapy crawl urlMonitor".
As soon as the process count drops below 4, the next command is sent to the shell, until all commands have finished.
Is there a way to do this with a shell script? I suppose it would involve some kind of endless loop, a break condition, and a method for checking the active processes. Unfortunately I'm not very good at shell scripting, so maybe someone can guide me in the right direction? These are the commands (a sketch of the polling idea follows the list):
nohup scrapy crawl urlMonitor -a slice=0 &
nohup scrapy crawl urlMonitor -a slice=1 &
nohup scrapy crawl urlMonitor -a slice=2 &
nohup scrapy crawl urlMonitor -a slice=3 &
nohup scrapy crawl urlMonitor -a slice=4 &
nohup scrapy crawl urlMonitor -a slice=5 &
nohup scrapy crawl urlMonitor -a slice=6 &
nohup scrapy crawl urlMonitor -a slice=7 &
nohup scrapy crawl urlMonitor -a slice=8 &
nohup scrapy crawl urlMonitor -a slice=9 &
nohup scrapy crawl urlMonitor -a slice=10 &
nohup scrapy crawl urlMonitor -a slice=11 &
nohup scrapy crawl urlMonitor -a slice=12 &
nohup scrapy crawl urlMonitor -a slice=13 &
nohup scrapy crawl urlMonitor -a slice=14 &
nohup scrapy crawl urlMonitor -a slice=15 &
nohup scrapy crawl urlMonitor -a slice=16 &
nohup scrapy crawl urlMonitor -a slice=17 &
nohup scrapy crawl urlMonitor -a slice=18 &
nohup scrapy crawl urlMonitor -a slice=19 &
nohup scrapy crawl urlMonitor -a slice=20 &
nohup scrapy crawl urlMonitor -a slice=21 &
nohup scrapy crawl urlMonitor -a slice=22 &
nohup scrapy crawl urlMonitor -a slice=23 &
nohup scrapy crawl urlMonitor -a slice=24 &
nohup scrapy crawl urlMonitor -a slice=25 &
nohup scrapy crawl urlMonitor -a slice=26 &
nohup scrapy crawl urlMonitor -a slice=27 &
nohup scrapy crawl urlMonitor -a slice=28 &
nohup scrapy crawl urlMonitor -a slice=29 &
nohup scrapy crawl urlMonitor -a slice=30 &
nohup scrapy crawl urlMonitor -a slice=31 &
nohup scrapy crawl urlMonitor -a slice=32 &
nohup scrapy crawl urlMonitor -a slice=33 &
nohup scrapy crawl urlMonitor -a slice=34 &
nohup scrapy crawl urlMonitor -a slice=35 &
nohup scrapy crawl urlMonitor -a slice=36 &
nohup scrapy crawl urlMonitor -a slice=37 &
nohup scrapy crawl urlMonitor -a slice=38 &
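A rough sketch of that polling idea, assuming pgrep is available (this sketch is not from the original post):

max_procs=4
for slice in {0..38}; do
    # pgrep -fc counts processes whose full command line matches the pattern
    while (( $(pgrep -fc "scrapy crawl urlMonitor") >= max_procs )); do
        sleep 5
    done
    nohup scrapy crawl urlMonitor -a slice=$slice &
done
wait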
Answer 0 (score: 1)
If you want 4 running at a time, continuously, try something like this:
max_procs=4
active_procs=0
for proc_num in {0..38}; do
    # If max_procs jobs are already running, wait for one to finish first
    if ((active_procs >= max_procs)); then
        wait -n
        ((active_procs--))
    fi
    nohup scrapy crawl urlMonitor -a slice=$proc_num &
    ((active_procs++))
done
# Wait for all remaining procs to finish
wait
This is a variation of the answer below that keeps max_procs jobs running at once: as soon as one finishes, it starts the next. The wait -n command waits for the next process to finish, rather than waiting for all of them.
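As a quick illustration of that behaviour (a throwaway sketch; wait -n requires Bash 4.3 or newer):

sleep 3 & sleep 1 & sleep 2 &
wait -n    # returns as soon as the first of the three background jobs exits
echo "one job finished, $(jobs -pr | wc -l) still running"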
Answer 1 (score: 0)
Try this:
for i in {0..38}; do
    nohup scrapy crawl urlMonitor -a slice=$i & _pid=$!
    # After every 4th launch, wait for that most recent job before continuing
    ((++i%4==0)) && wait $_pid
done
From help wait:

wait: wait [-n] [id ...]
    Wait for job completion and return exit status.

    Waits for each process identified by an ID, which may be a process ID or a
    job specification, and reports its termination status. If ID is not
    given, waits for all currently active child processes, and the return
    status is zero. If ID is a job specification, waits for all processes
    in that job's pipeline.

    If the -n option is supplied, waits for the next job to terminate and
    returns its exit status.

    Exit Status:
    Returns the status of the last ID; fails if ID is invalid or an invalid
    option is given.
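Note that waiting only on the PID of the 4th job in each group does not strictly cap the count at four if an earlier job in the group is still running. A small variant (a sketch, not part of the original answer) that drains the whole batch instead:

for i in {0..38}; do
    nohup scrapy crawl urlMonitor -a slice=$i &
    # After every 4th launch, wait for all jobs in the current batch
    (( (i + 1) % 4 == 0 )) && wait
done
wait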
Answer 2 (score: 0)
Here is a general method that always ensures there are fewer than 4 jobs running before launching any other job (yet there can be more than 4 jobs running simultaneously if a single line launches several jobs at once):
#!/bin/bash
max_nb_jobs=4
commands_file=$1
while IFS= read -r line; do
    while :; do
        mapfile -t jobs < <(jobs -pr)
        ((${#jobs[@]}<max_nb_jobs)) && break
        wait -n
    done
    eval "$line"
done < "$commands_file"
wait
Use this script with your file as the first argument.
How does it work? For each line read, we first make sure that fewer than max_nb_jobs are running, by counting the running jobs (obtained from jobs -pr). If there are max_nb_jobs or more, we wait for the next job to terminate (wait -n) and count the running jobs again. Once fewer than max_nb_jobs are running, we eval the line.
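For example, it could be invoked like this (the file name commands.txt and the script name throttle.sh are just placeholders for illustration):

# Generate the 39 crawl commands, one per line, into a file
printf 'nohup scrapy crawl urlMonitor -a slice=%d &\n' {0..38} > commands.txt
# Run them, at most 4 at a time
bash throttle.sh commands.txt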
Here's a similar script that doesn't use wait -n. It seems to do the job (tested on Debian with Bash 4.2):
#!/bin/bash
set -m
max_nb_jobs=4
file_list=$1
sleep_jobs() {
    # This function sleeps until there are less than $1 jobs running
    # Make sure that you have set -m before using this function!
    local n=$1 jobs
    while mapfile -t jobs < <(jobs -pr) && ((${#jobs[@]}>=n)); do
        # Start a coprocess that blocks in read, then have the next SIGCHLD
        # (a child terminating) write to it so the read returns and we loop
        coproc read
        trap "echo >&${COPROC[1]}; trap '' SIGCHLD" SIGCHLD
        wait $COPROC_PID
    done
}
while IFS= read -r line; do
    sleep_jobs $max_nb_jobs
    eval "$line"
done < "$file_list"
wait
Answer 3 (score: 0)
You could do this easily with GNU parallel or even just xargs. To wit:
declare -i i=0
while sleep 1; do
    printf 'slice=%d\n' $((i++))
done | xargs -n1 -P4 nohup scrapy crawl urlMonitor -a    # -n1: one slice per invocation, -P4: at most four in parallel
The while loop will run forever; if you know there's an actual hard limit, you can just use a for loop instead, like:
for i in {0..100}…
Also, the sleep 1 is helpful because it lets the shell handle signals more effectively.
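The answer also mentions GNU parallel; assuming it is installed, a rough one-line equivalent (a sketch, not part of the original answer) could be:

# -j4 caps parallel at four simultaneous jobs; {} is replaced by each input line
seq 0 38 | parallel -j4 nohup scrapy crawl urlMonitor -a slice={}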