Launching subshells with variable assignments and waiting for them

Time: 2021-06-05 01:24:21

Tags: bash shell

How can I launch several subshells that assign variables, and wait for all of them to finish?

#!/bin/bash

#some code about $FILE="$1"

cat "$FILE" | while read -r HOST || [[ -n $HOST ]];
do
    echo "$HOST";
    URL="http://$HOST";  QUEST1=$(curl -Is --connect-timeout 200 --max-time 200 "$URL" | head -1);
    P1=$!
    URL="https://$HOST"; QUEST2=$(curl -Is --connect-timeout 200 --max-time 200 "$URL" | head -1);
    P2=$!
    
    echo "$P1 $P2"
    wait $P1 $P2
    R1=$( echo "$QUEST1" | grep -o " 200" );
    R2=$( echo "$QUEST2" | grep -o " 200" );
    echo "$R1 $R2"
    
    if [[ "$R1" || "$R2" ]]; then
    echo "FOUND!";
    fi

done

This doesn't work. echo "$P1 $P2" is empty, because I am in a subshell. I want them to start concurrently, so that I don't have to wait for the first one to finish before the second starts.

OK, this is a basic question, but I want to understand how to apply it to other situations. Please, no external files.

EDIT For those who didn't understand: I want to run the $QUEST1 and $QUEST2 assignments in the background to save time, then wait, without using extra files. I have read a lot but haven't solved the problem. Thanks.

2 Answers:

Answer 0: (score: 3)

Here is a summary of the comments:

Assigning a subshell's output to a variable (i.e., consuming the subshell's STDOUT) means the parent shell will wait for all of the subshell's child processes to end, even if one of them is a backgrounded command.

For example:

x=$( { { /bin/sleep 10 ; echo out1; echo out2; } | head -1; } & ); \
echo "Wrong child PID : $!"

This blocks the parent shell for ten seconds. And here you get the parent shell's $!, not the one defined inside the subshell. To get the expected $!, you have to transmit it to the parent shell somehow (via STDOUT, STDERR, a file, a named pipe, etc.). You can do it via STDOUT, for example:

subpid=$( { { { /bin/sleep 10 ; echo out1; echo out2; } | head -1; } 1>&2 & } ; \
echo $!)

Here the command returns almost immediately (the parent shell no longer blocks on I/O), because the subshell sends its command output to STDERR and outputs only the child PID $! on STDOUT.
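A quick way to see the non-blocking behavior (my own sketch, using sleep in place of a real command and timing the assignment with bash's $SECONDS):

```shell
# Time the assignment itself: it should complete right away, while the
# backgrounded pipeline keeps running for 2 more seconds (output goes to stderr).
start=$SECONDS
subpid=$( { { { sleep 2; echo out1; } | head -1; } 1>&2 & } ; echo $! )
elapsed=$(( SECONDS - start ))
echo "assignment took ${elapsed}s, subshell pid: ${subpid}"
```

Without the 1>&2 redirection, the backgrounded pipeline would keep the substitution's STDOUT open and the assignment would take the full two seconds.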

Since you want to avoid I/O as much as possible, and if you only need the subshell's $! in order to wait for the child process, you can rely on the fact that the parent shell will wait for all STDOUT output from the subshell. Then your actual command is enough, with no need to know the subshell's $!:

URL="http://$HOST";  QUEST1=$(curl -Is --connect-timeout 200 --max-time 200 "$URL" \
| head -1);

However, if you need to know the subshell's child PID (note that this PID will be the subshell's PID here, not that of the curl or head commands) and wait for the subshell's command to complete, you can do something like this to get a near-deterministic order (it does not work if your subcommand does not contain at least one pipe):

x=$( { spid=$( { { { /bin/sleep 10;echo out1;echo out2; }|head -1;} 1>&2 & };echo $!);} \
2>&1 ; echo "SUBPID=$spid" )

This sets x, after ten seconds, to SUBPID=<subshell child pid> followed by out1.

At that point, this SUBPID will no longer exist (or will no longer be "your" subshell's child PID), but you can log it or do whatever you want with it.
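If you want to probe whether such a PID still exists at some later point, kill -0 is the usual test (a sketch of mine; it sends no signal, and as noted above it could also match an unrelated recycled PID):

```shell
# kill -0 delivers no signal; it only reports whether the PID currently exists.
sleep 0.2 &
probe_pid=$!
if kill -0 "${probe_pid}" 2>/dev/null; then
    echo "pid ${probe_pid} still exists"
fi
wait "${probe_pid}"
# Once the child has been reaped by wait, the PID is gone:
if kill -0 "${probe_pid}" 2>/dev/null; then
    alive_after_wait=1
else
    alive_after_wait=0
fi
echo "alive after wait: ${alive_after_wait}"
```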

Your command would look like:

URL="http://$HOST";  QUEST1=$( \
{ subpid=$( { { curl -Is --connect-timeout 200 --max-time 200 "$URL" | head -1; \
 } 1>&2 & } ;echo $!); } 2>&1 ; echo "SUBPID=$subpid" );

The first entry in QUEST1 should be SUBPID= followed by curl's first line of output.
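Splitting the SUBPID= line off the rest can then be done with parameter expansion; a sketch with a made-up PID and status line standing in for real curl output:

```shell
# Simulated capture: first line is SUBPID=<pid>, second line is curl's status line.
QUEST1="SUBPID=12345
HTTP/1.1 200 OK"
subpid="${QUEST1%%$'\n'*}"   # first line: "SUBPID=12345"
subpid="${subpid#SUBPID=}"   # strip the prefix -> "12345"
body="${QUEST1#*$'\n'}"      # everything after the first line
echo "pid=${subpid} body=${body}"
```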

To see clearly that the shell does wait, you can test it against google.com with a 10-second sleep inside:

URL="http://www.google.com";  QUEST1=$( { subpid=$( \
{ { { curl -Is --connect-timeout 200 --max-time 200 "$URL"; sleep 10; } | head -1; \
 } 1>&2 & };echo $!); } 2>&1 ; echo "SUBPID=$subpid" );

Update

After our exchange, I understand that you are looking for an asynchronous, waitable subprocess inside a subshell, whose output you need to retrieve when it finishes, all of this without using temporary files or named pipes.

There is a solution that needs no temporary files and no disk-write I/O, based on @htamas's solution to create an anonymous fifo, used as an anonymous pipe instead of a named pipe.

First, here is a simple example of this solution, followed by an implementation for your use case (many curl calls through subshells).

Example solution:

#!/bin/bash
# We use the bright solution from @htamas to create an anonymous pipe
# in the fds of our current shell.
# see: https://superuser.com/questions/184307/bash-create-anonymous-fifo
#
#
# 1. Creating the anonymous pipe
#

# start a background pipeline with two processes running forever
tail -f /dev/null | tail -f /dev/null &
# save the process ids
PID2=$!
PID1=$(jobs -p %+)
# hijack the pipe's file descriptors using procfs
exec 3>/proc/"${PID1}"/fd/1 4</proc/"${PID2}"/fd/0
# kill the background processes we no longer need
# (using disown suppresses the 'Terminated' message)
disown $PID2
kill "${PID1}" "${PID2}"
# anything we write to fd 3 can be read back from fd 4

#
# 2. Launching an "asynchronous subshell" and getting its output
#

# We set a flag to trap the async subshell termination through SIGHUP
ready=0;
trap "ready=1" SIGHUP;

# We launch our subshell for the subprocess "sleep 10" with its output
# connected to the standalone anonymous pipe.
# As the sleep command has no output, we add "starting" and "finish".
# Note that as we send the output elsewhere than STDOUT, it's non-blocking.
# Note also that we send SIGHUP to our parent shell ($$) when the command finishes.
x=$( { echo "starting"; sleep 10; echo "finish"; echo "EOF"; kill -SIGHUP $$; } >&3 & )

# We now wait for our subshell to terminate; it will terminate with the sleep command.
# While waiting, we can do stuff. Here we just display "waiting.." every second.
while [ "${ready}" = "0" ]; do
   echo "waiting for subshell..";
   sleep 1;
done;

# We close fd 3 early as we expect no more output from the subshell
exec 3>&-

# We recover our subshell's output from the read end of the anonymous pipe into y
line=""
y=$( while [ "${line}" != "EOF" ] ; do 
      read -r -u 4 line; 
      [ "${line}" != "EOF" ] && echo "${line}"; 
     done );

# And display the output of the subshell
echo "Subshell terminated, its output : ";
echo "${y}"

# close the file descriptors when we are finished (optional)
exec 4<&-

This solution requires the /proc filesystem, which is common on many modern UNIX systems. Explanations are provided as comments in the script.
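Since everything hinges on /proc, a script could guard for it up front; a minimal sketch (the flag name and message are my own):

```shell
# Check that procfs exposes this shell's file descriptors (Linux-style /proc);
# exec 3>/proc/<pid>/fd/1 cannot work without it (e.g. on macOS or some BSDs).
have_procfs=0
if [ -e "/proc/$$/fd/1" ]; then
    have_procfs=1
else
    echo "procfs not available; the anonymous-pipe trick cannot work here" >&2
fi
echo "have_procfs=${have_procfs}"
```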

Minor changes: better identification of the subshell processes, more progress information while waiting, and handling of potential subshell crashes.

Implementation for your use case:

#!/bin/bash
#
# Create the anonymous pipe.
# 
# Parameters: None.
# Returns:
#   0 : Success.
#   1 : Failed to launch tails.
#   2 : Failed to exec.
#   3 : Failed to kill tails process.
function CreateAnonymousPipe() {
  # We use the bright solution from @htamas to create an anonymous pipe
  # in the fds of our current shell.
  # see: https://superuser.com/questions/184307/bash-create-anonymous-fifo
  #
  local pid1
  local pid2
  # start a background pipeline with two processes running forever
  tail -f /dev/null | tail -f /dev/null &
  [ $? != 0 ] && return 1;
  # save the process ids
  pid2=$!
  pid1=$(jobs -p %+)
  # hijack the pipe's file descriptors using procfs
  exec 3>/proc/"${pid1}"/fd/1 4</proc/"${pid2}"/fd/0
  [ $? != 0 ] && return 2;
  # kill the background processes we no longer need
  # (using disown suppresses the 'Terminated' message)
  disown "${pid2}"
  kill "${pid1}" "${pid2}"
  [ $? != 0 ] && return 3;

  # anything we write to fd 3 can be read back from fd 4
  return 0;
}
#
# Asynchronously launch a curl process in a subshell.
# 
# Parameters: { URL } { indice }
#   URL : URL for the curl call.
#   indice : numeric identifier for this call
# Returns:
#   0 : Success.
#   1 : Missing parameters
#   2 : Failed to launch curl subprocess.
#   3 : Failed to access /proc
# STDOUT: PID of the corresponding subshell if success.
function CallCurl() {
  if [ $# != 2 ] ; then
    echo "CallCurl: URL and indice parameter are mandatory." 1>&2
    echo "          CallCurl { URL } { indice }." 1>&2
    return 1;
  fi
  [ ! -d /proc ] && return 3;
  local url="$1"
  local indice="$2"
  local subshell_PID
  # We launch our subshell for the subprocess curl with its output
  # connected to the standalone anonymous pipe.
  # The curl process output is prefixed with its indice in the URL arrays.
  # Note that the subshell first renames itself with a specific identifier, 
  # curl_<indice>, and that we escape $BASHPID to use its pid for that :
  #   1) We can't use $$ to get the subshell PID, as it is not a shell variable that
  #      can be evaluated at execution time. As it is "immutable" from the shell's
  #      point of view, it'll always be evaluated at first expansion, thus to the
  #      parent shell PID.
  #   2) We don't rename after the subshell launch using $! as its PID: by that time
  #      the subshell could have already terminated, and it's possible that another
  #      process has since been launched with this PID.
  # Note that we send its output elsewhere than STDOUT (to >&3), so it's non-blocking.
  # Note also that we send the USR1 signal to our parent shell ($$) when the command finishes.
  subshell_PID=$( { { local my_pid; 
                      eval my_pid="\${BASHPID}";
                      printf 'curl_%s' "${indice}">/proc/"${my_pid}"/comm 2>/dev/null;
                      curl -Is --connect-timeout 200 --max-time 200 "${url}" | head -1 |
                      { read -r line; echo "${indice}: ${line}"; };
                      kill -USR1 $$; 
                    } >&3 & 
                  } ; 
                  echo $!; )
  [ $? != 0 ] && return 2;
  echo "${subshell_PID}"
  return 0;
}
#
# Main URL processor, launches curl subprocesses asynchronously.
# 
# Parameters: { URL ... }
#   URL : URL to call with curl.
# Returns:
#   0 : Success.
#   1 : URL parameter(s) missing
#   2 : Failed to launch curl subprocess.
#   3 : Failed to create anonymous pipe.
# STDOUT: Processing and the outputs of the curl commands
function CurlProcessor() {
  if [ $# = 0 ] ; then
    echo "CurlProcessor: URL parameter is mandatory."  1>&2
    echo "               CurlProcessor { URL ... }." 1>&2
    return 1;
  fi
  local indice=0
  local isalive=0
  local -a URLarray
  # Feed the URL array
  while [ $# -gt 0 ] ; do URLarray+=("$1"); shift; done
  # Initialize a set of flags for each URL
  local -a ready
  for ((indice=0; indice < ${#URLarray[@]}; indice++)) ; do ready+=(0); done
  # Initialize an array of subshell PID for each URL to monitor
  local -a pid
  for ((indice=0; indice < ${#URLarray[@]}; indice++)) ; do pid+=(0); done
  # Initialize an array of subshell output for each URL 
  declare -a output
  for ((indice=0; indice < ${#URLarray[@]}; indice++)) ; do output+=(""); done
  # We create the anonymous pipe
  CreateAnonymousPipe
  [ $? != 0 ] && return 3;
  # Set a trap to catch USR1 and check which subshell are still alive through /proc
  # Local handler for the signals
  function trap_handler() {
    for indice in "${!pid[@]}" ; do
      if [ "${pid[${indice}]}" != "0" ] ; then 
        isalive="$(cat /proc/"${pid[${indice}]}"/comm 2>/dev/null)";
        [ "${isalive}" != "curl_${indice}" ] && ready[${indice}]=1;
      fi
    done
  }
  trap trap_handler USR1 2>/dev/null;
  # Now launch all the subshell
  for ((indice=0; indice < ${#URLarray[@]}; indice++)) ; do
    pid[${indice}]=$(CallCurl "${URLarray[${indice}]}" "${indice}"); 
    [ $? != 0 ] && return 2;
  done
  # We now wait for our subshells to terminate.
  # While waiting, we can do stuff. Here we just display "waiting.." every second.
  local all_finished=0
  local num_finished=0
  local last_num_finished=0
  local direct_check_timer=0
  while [ "${all_finished}" = "0" ]; do
     # We check each URL subshell flag and loop till there is at least one unfinished.
     all_finished=1
     num_finished=0
     for ((indice=0; indice < ${#ready[@]}; indice++)) ; do 
       if [ ${ready[${indice}]} = 0 ] ; then
         all_finished=0; 
       else
         ((num_finished++));
       fi
     done
     echo "waiting for subshells.. ${num_finished}/${#ready[@]} finished.";
     sleep 1;
     # In case one or more subshells have crashed and thus won't send the USR1 signal,
     # we run the handler here directly to check the states of the subshells after 5
     # seconds without any subshell termination in the interval.
     if [ "${all_finished}" = "0" ] ; then
       if [ "${last_num_finished}" = "${num_finished}" ] ; then
         ((direct_check_timer++))
         if [ "${direct_check_timer}" = "5" ] ; then
             echo "More than 5 seconds with no progress, doing a direct check."
             direct_check_timer=0 
             trap_handler
         fi
       else
         direct_check_timer=0 
       fi
     fi
     last_num_finished="${num_finished}"
  done;
  # All subshells have finished, we send EOF into the anonymous pipe
  echo "EOF" >&3
  # We close fd 3 early 
  exec 3>&-
  # We recover our subshells' outputs from the read end of the anonymous pipe
  local line=""
  local control=""
  while [ "${line}" != "EOF" ] ; do 
    read -r -u 4 line; 
    if [ "${line}" != "EOF" ] ; then
      # Each line should have "indice: " as a prefix to identify the URL associated
      indice="${line/: */}"
      if [ "${indice}" ] ; then
        control="${indice/[0-9]*/}"
        if [ "${control}" = "" ] ; then
          if [ "${output[${indice}]}" != "" ] ; then
            output[${indice}]="${output[${indice}]}\n${line/[0-9]*: /}"
          else
            output[${indice}]="${line/[0-9]*: /}"
          fi
        fi
      fi
    fi
  done
  # close the file descriptors when we are finished (optional)
  exec 4<&-
  # And display the output of the subshells
  echo "Subshells have all terminated, the output : ";
  for ((indice=0; indice < ${#URLarray[@]}; indice++)) ; do 
    echo "Output from URL ${URLarray[${indice}]} :"
    echo "${output[${indice}]}"
  done
  return 0;
}
#
# An example call of CurlProcessor
#
CurlProcessor "http://www.google.com" "http://stackoverflow.com/" "http://en.cppreference.com/"

With the example call, you get the following output:

waiting for subshells.. 0/3 finished.
waiting for subshells.. 3/3 finished.
Subshells have all terminated, the output :
Output from URL http://www.google.com :
HTTP/1.1 200 OK
Output from URL http://stackoverflow.com/ :
HTTP/1.1 301 Moved Permanently
Output from URL http://en.cppreference.com/ :
HTTP/1.1 302 Found

When Fastly is down, you get:

waiting for subshells.. 0/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
More than 5 seconds with no progress, doing a direct check.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
More than 5 seconds with no progress, doing a direct check.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
More than 5 seconds with no progress, doing a direct check.
waiting for subshells.. 3/3 finished.
Subshells have all terminated, the output :
Output from URL http://www.google.com :
HTTP/1.1 200 OK
Output from URL http://stackoverflow.com/ :
HTTP/1.1 503 Backend unavailable, connection timeout
Output from URL http://en.cppreference.com/ :
HTTP/1.1 302 Found

(The best moment to test the script ^^.)

Answer 1: (score: 0)

Solved it myself using @Zilog80's technique

====================================

EDIT:

This post is all wrong.

External files would be needed.

====================================

P.S. Look at his answer to understand it (the important part about STDOUT 1>&2, so that you don't create the wrong child).

I knew that in a subshell ( ) you can assign variables read out from the external script.

Grouping a block of code with braces creates no subshell. The command around curl is an anonymous function { }. (Reference: https://tldp.org)
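A minimal demonstration of that difference (my own example): an assignment in a brace group survives in the current shell, while an assignment in a subshell dies with it:

```shell
x=1
{ x=2; }                      # brace group: runs in the current shell
echo "after braces:   x=$x"   # x is now 2
( x=3 )                       # subshell: the assignment dies with the subshell
echo "after subshell: x=$x"   # x is still 2
```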


So this time you can write:

( export QUEST2=$(curl -Is --connect-timeout 2 --max-time 2 $URL | head -1) ) 2>&1 & P2=$!

I have the subshell's PID and the task completes.
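The catch, which the EDIT above concedes, is that the exported variable never makes it back to the parent shell; a minimal sketch with a canned status line instead of a real curl call:

```shell
QUEST2=""
# The background subshell runs and finishes fine...
( export QUEST2=$(echo "HTTP/1.1 200 OK" | head -1) ) 2>&1 & P2=$!
wait $P2
# ...but its assignment never reaches the parent shell:
echo "QUEST2 in parent: '${QUEST2}'"   # prints: QUEST2 in parent: ''
```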

The full code would be:


#!/bin/bash

FILE="$1"

cat "$FILE" | while read -r HOST || [[ -n $HOST ]];
do
    echo "$HOST";
    URL="http://$HOST";  ( export QUEST1=$(curl -Is --connect-timeout 2 --max-time 2 $URL | head -1) ) 2>&1 & P1=$!
    URL="https://$HOST"; ( export QUEST2=$(curl -Is --connect-timeout 2 --max-time 2 $URL | head -1) ) 2>&1 & P2=$!

    
    echo "$P1 $P2"
    wait $P1 $P2
    R1=$( echo "$QUEST1" | grep -o " 200" );
    R2=$( echo "$QUEST2" | grep -o " 200" );
    echo "$R1 $R2"
    
    if [[ "$R1" || "$R2" ]]; then
    echo "FOUND!";
    fi

done

where $FILE is textfile.txt, containing a list of hosts/IPs like:

google.com
software.net
hacking.org
nasa.gov

Now you can launch your script to check whether a site has the http or https protocol enabled

(USELESS LIKE NO OTHER :() --> but good for understanding how to fork

THANKS to @Zilog80