This is a simple bash script for checking HTTP status codes:
while read url
do
urlstatus=$(curl -o /dev/null --silent --head --write-out '%{http_code}' "${url}" --max-time 5 )
echo "$url $urlstatus" >> urlstatus.txt
done < "$1"
I am reading URLs from a text file, but it processes only one at a time, which takes too much time. GNU parallel and xargs also process one line at a time (tested).
How can I process the URLs concurrently to improve the timing? In other words, I want to thread over the URL file rather than over bash commands (which is what GNU parallel and xargs do).
Input file is txt file and lines are separated as
ABC.Com
Bcd.Com
Any.Google.Com
Something like this
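For reference, the loop above can also be parallelized with plain bash background jobs. This is only a sketch: the function name check_all is illustrative and not from the original post, and it puts no cap on concurrency, so every URL spawns its own curl process at once.

```shell
# Sketch: run each check as a bash background job, then wait for all of them.
check_all() {
  local url
  while read -r url; do
    {
      status=$(curl -o /dev/null --silent --head \
               --write-out '%{http_code}' "$url" --max-time 5) || true  # 000 on failure
      echo "$url $status"
    } &
  done < "$1"
  wait  # block until every background curl has exited
}

# Usage: check_all urls.txt >> urlstatus.txt
```

Note that unlike GNU parallel or xargs, this offers no way to bound the number of simultaneous processes, so it is only sensible for small URL lists.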
Answer 0 (score: 2)
"GNU parallel and xargs also process one line at a time (tested)"
Can you give an example of that? If you use -j, you should be able to run more than one process at a time.
I would write it like this:
doit() {
url="$1"
urlstatus=$(curl -o /dev/null --silent --head --write-out '%{http_code}' "${url}" --max-time 5 )
echo "$url $urlstatus"
}
export -f doit
cat "$1" | parallel -j0 -k doit >> urlstatus.txt
Given this input:
Input file is txt file and lines are separated as
ABC.Com
Bcd.Com
Any.Google.Com
Something like this
www.google.com
pi.dk
I get this output:
Input file is txt file and lines are separated as 000
ABC.Com 301
Bcd.Com 301
Any.Google.Com 000
Something like this 000
www.google.com 302
pi.dk 200
Which looks correct:
000 if domain does not exist
301/302 for redirection
200 for success
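If GNU parallel is not available, a similar fan-out can be sketched with GNU xargs and its -P option. This is an assumption-laden sketch, not part of the original answer: the name check_url and the concurrency level of 10 are illustrative.

```shell
# Same check as the doit() above, under an illustrative name.
check_url() {
  local url="$1"
  local status
  status=$(curl -o /dev/null --silent --head \
           --write-out '%{http_code}' "$url" --max-time 5) || true  # 000 on failure
  echo "$url $status"
}
export -f check_url

# GNU xargs: -a reads URLs from the file, -n1 passes one URL per invocation,
# -P10 keeps up to 10 curl processes running at once.
# Usage (assuming the URL file is the script's first argument):
#   xargs -a "$1" -n1 -P10 bash -c 'check_url "$1"' _ >> urlstatus.txt
```

One difference from the parallel -k version: xargs does not preserve input order, so the lines in urlstatus.txt may come out in completion order rather than file order.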