Question

Here is a simple bash script that checks HTTP status codes:

while IFS= read -r url
    do
        urlstatus=$(curl -o /dev/null --silent --head --write-out '%{http_code}' --max-time 5 "${url}")
        echo "$url  $urlstatus" >> urlstatus.txt
    done < "$1"

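I am reading the URLs from a text file, but the script processes only one at a time, which takes too long. GNU parallel and xargs also process one line at a time (tested).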

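How can I process several URLs concurrently to reduce the total run time? In other words, I want to parallelize over the lines of the URL file rather than over bash commands (which is what GNU parallel and xargs do).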

 Input file is txt file and lines are separated  as
    ABC.Com
    Bcd.Com
    Any.Google.Com

Something  like this

Answer 1

"GNU parallel and xargs also process one line at a time (tested)"

Can you give an example? If you use -j, you should be able to run multiple processes at once.

I would write it like this:

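doit() {
    # Print the URL followed by its HTTP status code (000 if no response).
    local url="$1"
    local urlstatus
    urlstatus=$(curl -o /dev/null --silent --head --write-out '%{http_code}' --max-time 5 "${url}")
    echo "$url  $urlstatus"
}
# Export the function so the shells started by GNU parallel can call it.
export -f doit
# -j0 runs as many jobs in parallel as possible; -k keeps output in input order.
parallel -j0 -k doit < "$1" >> urlstatus.txt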
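
For systems without GNU parallel, roughly the same fan-out can be sketched with xargs -P, an extension available in GNU and BSD xargs. This is an illustrative variant, not part of the original answer: the names check_url and check_all and the default of 10 jobs are assumptions. Unlike parallel -k, the output lines arrive in completion order rather than input order.

```shell
#!/usr/bin/env bash
# Sketch: the same check, fanned out with xargs -P instead of GNU parallel.
# check_url and check_all are hypothetical names chosen for this example.

check_url() {
    local url="$1" status
    # curl reports 000 when no HTTP response was received.
    status=$(curl -o /dev/null --silent --head --write-out '%{http_code}' --max-time 5 "$url")
    printf '%s  %s\n' "$url" "$status"
}
# Export the function so the bash processes started by xargs can call it.
export -f check_url

check_all() {
    # $1 = file with one URL per line, $2 = number of parallel jobs (default 10).
    # Blank lines are dropped; NUL separators keep each line as a single argument.
    grep -v '^$' "$1" | tr '\n' '\0' |
        xargs -0 -n 1 -P "${2:-10}" bash -c 'check_url "$1"' _
}
```

Calling check_all urls.txt 20 >> urlstatus.txt would check up to 20 URLs at once (urls.txt is a placeholder file name).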

Given this input:

Input file is txt file and lines are separated  as
ABC.Com
Bcd.Com
Any.Google.Com
Something  like this
www.google.com
pi.dk

I get this output:

Input file is txt file and lines are separated  as  000
ABC.Com  301
Bcd.Com  301
Any.Google.Com  000
Something  like this  000
www.google.com  302
pi.dk  200

This looks correct:

000 if domain does not exist
301/302 for redirection
200 for success
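
The mapping above can be turned into a small helper for annotating a results file. This is only a sketch: classify_status and summarize are hypothetical names, the 4xx and 5xx labels are added for completeness, and the input is assumed to be lines of the form "URL  STATUS" as written by the script. Note that curl also reports 000 when a request times out or the connection fails, not only when the domain does not exist.

```shell
#!/usr/bin/env bash
# Sketch: label each status code with a category.
# classify_status and summarize are hypothetical names for this example.

classify_status() {
    case "$1" in
        000) echo "no response" ;;   # no HTTP response (bad domain, timeout, ...)
        2??) echo "success" ;;
        3??) echo "redirect" ;;
        4??) echo "client error" ;;
        5??) echo "server error" ;;
        *)   echo "unknown" ;;
    esac
}

summarize() {
    # $1 = results file; the status is the last space-separated field of each line.
    local line status
    while IFS= read -r line; do
        status=${line##* }
        printf '%s  %s\n' "$line" "$(classify_status "$status")"
    done < "$1"
}
```

Running summarize urlstatus.txt would print each result line followed by its category.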
