How to launch subshells with variable assignments and wait for them all to complete?
#!/bin/bash
# some code that assigns FILE="$1"
cat "$FILE" | while read -r HOST || [[ -n $HOST ]];
do
    echo "$HOST";
    URL="http://$HOST"; QUEST1=$(curl -Is --connect-timeout 200 --max-time 200 "$URL" | head -1);
    P1=$!
    URL="https://$HOST"; QUEST2=$(curl -Is --connect-timeout 200 --max-time 200 "$URL" | head -1);
    P2=$!
    echo "$P1 $P2"
    wait $P1 $P2
    R1=$( echo "$QUEST1" | grep -o " 200" );
    R2=$( echo "$QUEST2" | grep -o " 200" );
    echo "$R1 $R2"
    if [[ "$R1" || "$R2" ]]; then
        echo "FOUND!";
    fi
done
This doesn't work: `echo "$P1 $P2"` is empty because I'm in a subshell. I want the two requests to run concurrently, so that I don't have to wait for the first one to finish before starting the second. OK, this is a basic question, but I want to understand how to apply it to other situations. Please, no external files.

EDIT

For those who didn't understand: I want to put `$QUEST1` and `$QUEST2` in the background to save time, and then wait, without using extra files. I've read a lot but haven't solved the problem. Thanks.
Answer 0 (score: 3)
Here is a summary of the comments:

Assigning a subshell's output to a variable / consuming a subshell's output (STDOUT) means the parent shell will wait for all of the subshell's child processes to end, even if one of them is a backgrounded command.

As an example:
x=$( { { /bin/sleep 10 ; echo out1; echo out2; } | head -1; } & ); \
echo "Wrong child PID : $!"
This will block the parent shell for ten seconds. And here you get the parent shell's `$!`, not the one defined inside the subshell. To get the expected `$!`, you have to transfer it to your parent shell somehow (through STDOUT, STDERR, a file, a named pipe, etc.). You can do it through STDOUT, for example:
subpid=$( { { { /bin/sleep 10 ; echo out1; echo out2; } | head -1; } 1>&2 & } ; \
echo $!)
Here, as your subshell sends its command output to STDERR and writes only the child PID `$!` to STDOUT, the command returns almost immediately (the parent shell no longer blocks on I/O).
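To see that the substitution really returns before the backgrounded pipeline does, here is a minimal timing sketch; `sleep`/`echo` stand in for the real command so it runs anywhere:

```shell
#!/bin/bash
# The backgrounded pipeline writes to STDERR, so the command substitution's
# STDOUT closes as soon as "echo $!" runs and the assignment returns at once.
start=$SECONDS
subpid=$( { { { sleep 2; echo out1; echo out2; } | head -1; } 1>&2 & } ; echo $! )
elapsed=$(( SECONDS - start ))
echo "returned after ${elapsed}s, subshell pid: ${subpid}"
# "out1" still appears on STDERR roughly two seconds later.
```

The assignment completes in well under two seconds, while the pipeline keeps running in the background.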
Since you want to avoid I/O as much as possible, and if you only need the subshell's `$!` in order to wait for the child process, you can rely on the fact that the parent shell will wait for all STDOUT output coming from the subshell. Then your actual command is enough; there is no need to know the subshell's `$!`:
URL="http://$HOST"; QUEST1=$(curl -Is --connect-timeout 200 --max-time 200 "$URL" \
| head -1);
However, if you need to know the subshell's child PID (note that this PID will be the subshell's PID, not that of the curl or head command) and to wait for the subshell command to complete, you can do something like this to get a near-deterministic order (it does not work if your subcommand doesn't contain at least one pipe):
x=$( { spid=$( { { { /bin/sleep 10;echo out1;echo out2; }|head -1;} 1>&2 & };echo $!);} \
2>&1 ; echo "SUBPID=$spid" )
This returns, in `x`, after ten seconds: `SUBPID=<subshell child pid>` followed by `out1`.

At that point, this SUBPID will no longer exist (or will no longer be "your" subshell's child PID), but you can log it or do whatever you want with it.

Your command would then look like:
URL="http://$HOST"; QUEST1=$( \
{ subpid=$( { { curl -Is --connect-timeout 200 --max-time 200 "$URL" | head -1; \
} 1>&2 & } ;echo $!); } 2>&1 ; echo "SUBPID=$subpid" );
The first entry in `QUEST1` should be `SUBPID=`.
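Once captured, you can split that `SUBPID=` entry off from the real payload with plain parameter expansion. A minimal sketch of the idea, with `sleep`/`echo` standing in for the curl call:

```shell
#!/bin/bash
# Capture pattern from the answer; the first captured line is "SUBPID=<pid>"
# (written immediately), the command's real output arrives afterwards.
QUEST1=$( { subpid=$( { { { sleep 1; echo "HTTP/1.1 200 OK"; } | head -1; } 1>&2 & } ; echo $! ); } 2>&1 ; echo "SUBPID=$subpid" )
# Split the first line (the pid entry) from the rest (the payload).
subpid=${QUEST1%%$'\n'*}; subpid=${subpid#SUBPID=}
payload=${QUEST1#*$'\n'}
echo "pid=${subpid}"
echo "output=${payload}"
```

The outer substitution still blocks until the backgrounded pipeline closes the merged STDERR, which is exactly the "wait" you wanted.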
To clearly show that the shell waits, you can test it against google.com with a 10-second sleep inside:
URL="http://www.google.com"; QUEST1=$( { subpid=$( \
{ { { curl -Is --connect-timeout 200 --max-time 200 "$URL"; sleep 10; } | head -1; \
} 1>&2 & };echo $!); } 2>&1 ; echo "SUBPID=$subpid" );
After our exchange, I understand that you are looking for asynchronous, waitable subprocesses in subshells, from which you need to get the output when they finish, all without using temporary files or named pipes.

There is a solution that needs no temporary files and no disk write I/O, based on @htamas's solution to create an anonymous fifo, using an anonymous pipe instead of a named pipe.

First, here is a simple example of this solution, followed by an implementation of your use case (many curl calls through subshells).
#!/bin/bash
# We use the bright solution from @htamas to create an anonymous pipe
# in the fds of our current shell.
# see: https://superuser.com/questions/184307/bash-create-anonymous-fifo
#
#
# 1. Creating the anonymous pipe
#
# start a background pipeline with two processes running forever
tail -f /dev/null | tail -f /dev/null &
# save the process ids
PID2=$!
PID1=$(jobs -p %+)
# hijack the pipe's file descriptors using procfs
exec 3>/proc/"${PID1}"/fd/1 4</proc/"${PID2}"/fd/0
# kill the background processes we no longer need
# (using disown suppresses the 'Terminated' message)
disown $PID2
kill "${PID1}" "${PID2}"
# anything we write to fd 3 can be read back from fd 4
#
# 2. Launching an "asynchronous subshell" and getting its output
#
# We set a flag to trap the async subshell termination through SIGHUP
ready=0;
trap "ready=1" SIGHUP;
# We launch our subshell for the subprocess "sleep 10" with its output
# connected to the standalone anonymous pipe.
# As the sleep command has no output, we add "starting" and "finish".
# Note that as we send the output elsewhere than STDOUT, it's non blocking
# Note also that we send SIGHUP to our parent shell ($$) when the command finishes.
x=$( { echo "starting"; sleep 10; echo "finish"; echo "EOF"; kill -SIGHUP $$; } >&3 & )
# We now wait for our subshell to terminate; it terminates after the sleep command.
# While waiting, we can do stuff. Here we just display "waiting.." every second.
while [ "${ready}" = "0" ]; do
echo "waiting for subshell..";
sleep 1;
done;
# We close fd 3 early as we expect no more output from the subshell
exec 3>&-
# We recover our subshell output from the read end of the anonymous pipe into y
line=""
y=$( while [ "${line}" != "EOF" ] ; do
read -r -u 4 line;
[ "${line}" != "EOF" ] && echo "${line}";
done );
# And display the output of the subshell
echo "Subshell terminated, its output : ";
echo "${y}"
# close the file descriptors when we are finished (optional)
exec 4<&-
This solution requires the `/proc` filesystem, which is common on most actual UNIX systems. Explanations are provided as comments in the script.
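Before relying on it, you can probe whether procfs fd entries are actually usable on your system; a short sketch (`/proc/<pid>/fd` is Linux-specific, so e.g. macOS would need the named-pipe fallback):

```shell
#!/bin/bash
# Probe for the /proc fd entries the anonymous-pipe trick depends on.
if [ -e "/proc/$$/fd/0" ]; then
    echo "procfs fd access available, anonymous pipe trick should work"
else
    echo "no /proc fd support here, fall back to a named pipe (mkfifo)" >&2
    exit 1
fi
```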
Small changes: better identification of the subshell processes, more process information while waiting, and handling of potential subshell crashes.
#!/bin/bash
#
# Create the anonymous pipe.
#
# Parameters: None.
# Returns:
# 0 : Success.
# 1 : Failed to launch tails.
# 2 : Failed to exec.
# 3 : Failed to kill tails process.
function CreateAnonymousPipe() {
# We use the bright solution from @htamas to create an anonymous pipe
# in the fds of our current shell.
# see: https://superuser.com/questions/184307/bash-create-anonymous-fifo
#
local pid1
local pid2
# start a background pipeline with two processes running forever
tail -f /dev/null | tail -f /dev/null &
[ $? != 0 ] && return 1;
# save the process ids
pid2=$!
pid1=$(jobs -p %+)
# hijack the pipe's file descriptors using procfs
exec 3>/proc/"${pid1}"/fd/1 4</proc/"${pid2}"/fd/0
[ $? != 0 ] && return 2;
# kill the background processes we no longer need
# (using disown suppresses the 'Terminated' message)
disown "${pid2}"
kill "${pid1}" "${pid2}"
[ $? != 0 ] && return 3;
# anything we write to fd 3 can be read back from fd 4
return 0;
}
#
# Asynchronously launch a curl process in a subshell.
#
# Parameters: { URL } { indice }
# URL : URL for the curl call.
# indice : numeric identifier for this call
# Returns:
# 0 : Success.
# 1 : Missing parameters
# 2 : Failed to launch curl subprocess.
# 3 : Failed to access /proc
# STDOUT: PID of the corresponding subshell if success.
function CallCurl() {
if [ $# != 2 ] ; then
echo "CallCurl: URL and indice parameter are mandatory." 1>&2
echo " CallCurl { URL } { indice }." 1>&2
return 1;
fi
[ ! -d /proc ] && return 3;
local url="$1"
local indice="$2"
local subshell_PID
# We launch our subshell for the curl subprocess with its output
# connected to the standalone anonymous pipe.
# The curl process output is prefixed with its indice in the URL arrays.
# Note that the subshell first renames itself with a specific identifier,
# curl_<indice>, and that we escape $BASHPID to use its pid for that :
# 1) We can't use $$ to get the subshell PID as it is not a shell variable that
# can be evaluated at execution. As it is "immutable" from the shell point of
# view, it'll be always evaluated at first expansion, thus the parent shell PID.
# 2) We don't rename after subshell launch using $! as its PID, at this time the
# subshell could have already terminated and its possible that another process
# have since been launched with this PID.
# Note that we send its output elsewhere than STDOUT (to >&3), so it's non blocking.
# Note also that we send USR1 signal to our parent shell ($$) when the command finishes.
subshell_PID=$( { { local my_pid;
eval my_pid="\${BASHPID}";
printf 'curl_%s' "${indice}">/proc/"${my_pid}"/comm 2>/dev/null;
curl -Is --connect-timeout 200 --max-time 200 "${url}" | head -1 |
{ read -r line; echo "${indice}: ${line}"; };
kill -USR1 $$;
} >&3 &
} ;
echo $!; )
[ $? != 0 ] && return 2;
echo "${subshell_PID}"
return 0;
}
#
# Main URL processor, launches curl subprocesses asynchronously.
#
# Parameters: { URL ... }
# URL : URL to call with curl.
# Returns:
# 0 : Success.
# 1 : URL parameter(s) missing
# 2 : Failed to launch curl subprocess.
# 3 : Failed to create anonymous pipe.
# STDOUT: Processing and the outputs of the curl commands
function CurlProcessor() {
if [ $# = 0 ] ; then
echo "CurlProcessor: URL parameter is mandatory." 1>&2
echo " CurlProcessor { URL ... }." 1>&2
return 1;
fi
local indice=0
local isalive=0
local -a URLarray
# Feed the URL array
while [ $# -gt 0 ] ; do URLarray+=("$1"); shift; done
# Initialize a set of flags for each URL
local -a ready
for ((indice=0; indice < ${#URLarray[@]}; indice++)) ; do ready+=(0); done
# Initialize an array of subshell PID for each URL to monitor
local -a pid
for ((indice=0; indice < ${#URLarray[@]}; indice++)) ; do pid+=(0); done
# Initialize an array of subshell output for each URL
declare -a output
for ((indice=0; indice < ${#URLarray[@]}; indice++)) ; do output+=(""); done
# We create the anonymous pipe
CreateAnonymousPipe
[ $? != 0 ] && return 3;
# Set a trap to catch USR1 and check which subshells are still alive through /proc
# Local handler for the signals
function trap_handler() {
for indice in "${!pid[@]}" ; do
if [ "${pid[${indice}]}" != "0" ] ; then
isalive="$(cat /proc/"${pid[${indice}]}"/comm 2>/dev/null)" 2>/dev/null;
[ "${isalive}" != "curl_${indice}" ] && ready[${indice}]=1;
fi
done
}
trap trap_handler USR1 2>/dev/null;
# Now launch all the subshells
for ((indice=0; indice < ${#URLarray[@]}; indice++)) ; do
pid[${indice}]=$(CallCurl "${URLarray[${indice}]}" "${indice}");
[ $? != 0 ] && return 2;
done
# We now wait for our subshells to terminate.
# While waiting, we can do stuff. Here we just display "waiting.." every second.
local all_finished=0
local num_finished=0
local last_num_finished=0
local direct_check_timer=0
while [ "${all_finished}" = "0" ]; do
# We check each URL subshell flag and loop till there is at least one unfinished.
all_finished=1
num_finished=0
for ((indice=0; indice < ${#ready[@]}; indice++)) ; do
if [ ${ready[${indice}]} = 0 ] ; then
all_finished=0;
else
((num_finished++));
fi
done
echo "waiting for subshells.. ${num_finished}/${#ready[@]} finished.";
sleep 1;
# In case one or more subshells have crashed and thus won't send the USR1 signal,
# we launch the handler here to check the states of the subshells after 5 seconds
# without any subshell termination in the interval.
if [ "${all_finished}" = "0" ] ; then
if [ "${last_num_finished}" = "${num_finished}" ] ; then
((direct_check_timer++))
if [ "${direct_check_timer}" = "5" ] ; then
echo "More than 5 seconds with no progress, doing a direct check."
direct_check_timer=0
trap_handler
fi
else
direct_check_timer=0
fi
fi
last_num_finished="${num_finished}"
done;
# All subshells have finished; we send EOF into the anonymous pipe
echo "EOF" >&3
# We close fd 3 early
exec 3>&-
# We recover our subshells' outputs from the read end of the anonymous pipe
local line=""
local control=""
while [ "${line}" != "EOF" ] ; do
read -r -u 4 line;
if [ "${line}" != "EOF" ] ; then
# Each line should have "indice: " as a prefix to identify the URL associated
indice="${line/: */}"
if [ "${indice}" ] ; then
control="${indice/[0-9]*/}"
if [ "${control}" = "" ] ; then
if [ "${output[${indice}]}" != "" ] ; then
output[${indice}]="${output[${indice}]}\n${line/[0-9]*: /}"
else
output[${indice}]="${line/[0-9]*: /}"
fi
fi
fi
fi
done
# close the file descriptors when we are finished (optional)
exec 4<&-
# And display the output of the subshells
echo "Subshells have all terminated, the output : ";
for ((indice=0; indice < ${#URLarray[@]}; indice++)) ; do
echo "Output from URL ${URLarray[${indice}]} :"
echo "${output[${indice}]}"
done
return 0;
}
#
# An example call of CurlProcessor
#
CurlProcessor "http://www.google.com" "http://stackoverflow.com/" "http://en.cppreference.com/"
With the example call, you get the following output:
waiting for subshells.. 0/3 finished.
waiting for subshells.. 3/3 finished.
Subshells have all terminated, the output : 
Output from URL http://www.google.com :
HTTP/1.1 200 OK
Output from URL http://stackoverflow.com/ :
HTTP/1.1 301 Moved Permanently
Output from URL http://en.cppreference.com/ :
HTTP/1.1 302 Found
And when fastly is down, you get:
waiting for subshells.. 0/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
More than 5 seconds with no progress, doing a direct check.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
More than 5 seconds with no progress, doing a direct check.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
waiting for subshells.. 2/3 finished.
More than 5 seconds with no progress, doing a direct check.
waiting for subshells.. 3/3 finished.
Subshells have all terminated, the output : 
Output from URL http://www.google.com :
HTTP/1.1 200 OK
Output from URL http://stackoverflow.com/ :
HTTP/1.1 503 Backend unavailable, connection timeout
Output from URL http://en.cppreference.com/ :
HTTP/1.1 302 Found
(The best possible time to test the script ^^.)
Answer 1 (score: 0)

P.S. Look at his answer to understand (the important part about STDOUT `1>&2`, so that you don't create the wrong child).
I learned that in a subshell `( )` you can assign variables read from the outer script. Grouping a code block with curly braces creates no subshell; the braces around the curl command are just an anonymous group `{ }`. (Reference: https://tldp.org)
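The difference between the two groupings is easy to observe directly; a small sketch:

```shell
#!/bin/bash
# Braces group commands in the current shell; parentheses fork a subshell,
# so assignments made inside ( ) never reach the parent.
x=before
{ x=braces; }    # brace group: runs in the current shell process
echo "$x"        # prints: braces
( x=parens )     # subshell: the assignment dies with the child process
echo "$x"        # still prints: braces
```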
So now you can write:
( export QUEST2=$(curl -Is --connect-timeout 2 --max-time 2 $URL | head -1) ) 2>&1 & P2=$!
and I have the subshell's PID and the task completing in the background.

The whole code would be:
#!/bin/bash
FILE="$1"
cat "$FILE" | while read -r HOST || [[ -n $HOST ]];
do
    echo "$HOST";
    URL="http://$HOST"; ( export QUEST1=$(curl -Is --connect-timeout 2 --max-time 2 $URL | head -1) ) 2>&1 & P1=$!
    URL="https://$HOST"; ( export QUEST2=$(curl -Is --connect-timeout 2 --max-time 2 $URL | head -1) ) 2>&1 & P2=$!
    echo "$P1 $P2"
    wait $P1 $P2
    R1=$( echo "$QUEST1" | grep -o " 200" );
    R2=$( echo "$QUEST2" | grep -o " 200" );
    echo "$R1 $R2"
    if [[ "$R1" || "$R2" ]]; then
        echo "FOUND!";
    fi
done
where `$FILE` is textfile.txt, containing a list of hosts/IPs like:
google.com
software.net
hacking.org
nasa.gov
Now you can launch your script to test whether a site has the http or https protocol enabled.

(USE LIKE NO OTHER :() --> learn how to fork well

THANKS to @Zilog80.