Question

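I want to check 100000k+ URLs on Linux.

These links are actually the OTA packages [zip] for my Android device.

Only one of these links is valid; all the others return a 404 error.

So how can I check all of the links in a short time, on a Linux server or a web server [Apache]?

URL structure: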

http://link.com/updateOTA_1.zip

http://link.com/updateOTA_2.zip

http://link.com/updateOTA_999999999.zip

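OK, here is what I have tried.

I wrote this script, but it is really slow: http://pastebin.com/KVxnzttA. I also increased the number of threads to 500, and then my server crashed :[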

#!/bin/bash
for a in {1487054155500..1487055000000}
do
  if [ $((a%50)) = 0 ]
    then
    curl -s -I http://link.com/updateOTA_$((a)).zip | head -n1 & 
    curl -s -I http://link.com/updateOTA_$((a+1)).zip | head -n1 &
    curl -s -I http://link.com/updateOTA_$((a+2)).zip | head -n1 &
    curl -s -I http://link.com/updateOTA_$((a+3)).zip | head -n1 &
    curl -s -I http://link.com/updateOTA_$((a+4)).zip | head -n1 &
...
    curl -s -I http://link.com/updateOTA_$((a+49)).zip | head -n1 &
    curl -s -I http://link.com/updateOTA_$((a+50)).zip | head -n1
    wait
    echo "$((a))"
  fi
done

I tried aria2, but the maximum number of threads in aria2 is 16, so that failed too.

I tried some online tools, but they limit me to 100 URLs.

Answer 1

Running curl 100,000+ times is going to be slow. Instead, feed batches of URLs to a single curl instance to reduce the overhead of starting curl.

# This loop doesn't require pre-generating a list of a million integers
for ((a=1487054155500; a<=1487055000000; a+=50)); do
  for ((k=0; k<50; k++)); do
    printf 'url = %s\n' "http://link.com/updateOTA_$((a+k)).zip"
  done | curl -I -K - -w 'result: %{http_code} %{url_effective}\n' | grep -F 'result:' > "batch-$a.txt"
done

The -w option produces output that associates each URL with its result, should you want that.
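Since only one URL is expected to be valid, the batch files can then be searched for any result that is not a 404. Here is a minimal sketch, assuming each batch-*.txt file holds one `result: <code> <url>` entry per line (which requires the -w format to end in a newline). The function name `find_non_404` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the status code and URL of every "result:"
# entry whose status code is not 404.
# Assumes each input line looks like:
#   result: 404 http://link.com/updateOTA_123.zip
find_non_404() {
  awk '$1 == "result:" && $2 != "404" { print $2, $3 }' "$@"
}
```

It could be run as `find_non_404 batch-*.txt` once the loop has finished, or on the batch files written so far while the loop is still running.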

Answer 2

In the end I found a solution using aria2c.

It now scans 7k URLs per minute.

Thanks, everyone.

aria2c -i url -s16 -x16 --max-concurrent-downloads=1000
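The command above reads its URL list from a file named url (the argument to -i), with one URL per line. Here is a minimal sketch of generating such a file, assuming the numbering scheme from the question. The function name `write_url_list` and the small demo range are illustrative only.

```shell
#!/bin/bash
# Hypothetical helper: print one URL per line for the IDs first..last (inclusive).
write_url_list() {
  local first=$1 last=$2 a
  for ((a = first; a <= last; a++)); do
    printf 'http://link.com/updateOTA_%d.zip\n' "$a"
  done
}

# Small demo range; the range in the question is far larger.
write_url_list 1487054155500 1487054155504 > url
```

Note that aria2c downloads whatever it finds. If only an availability check is wanted, its --dry-run option may be worth a look.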

Check 10000K+ URLs
