Question

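I'm looking for some tips on how to take my single-URL wget script and make it work with a list of URLs from a text file. I don't know how to script it: should I loop over the list, or enumerate it somehow? Here is the code I use to collect everything from a single page: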

wget \
    --recursive \
    --no-clobber \
    --page-requisites \
    --html-extension \
    --convert-links \
    --restrict-file-names=windows \
    --domains example.com \
    --no-parent \
        http://www.example.com/folder1/folder/

It works very well. I'm just lost on how to use it with the URLs listed in list.txt, such as:

http://www.example.com/folder1/folder/
http://www.example.com/sports1/events/
http://www.example.com/milfs21/delete/
...

I assume this is simple, but then again you never know. Thanks.

Answer 1

According to the wget documentation (man wget):

   -i file
   --input-file=file
       Read URLs from a local or external file.  If - is specified as
       file, URLs are read from the standard input.  (Use ./- to read from
       a file literally named -.)
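Applied to the command from the question, that could look like the sketch below. It only swaps the trailing URL for --input-file; the function name mirror_from_list is a hypothetical wrapper added for illustration, and list.txt is assumed to hold one URL per line.

```shell
# Sketch: same options as in the question, but wget reads the URLs itself.
# mirror_from_list is a hypothetical helper name, not part of wget.
# The first argument is the list file; it defaults to list.txt.
mirror_from_list() {
    wget \
        --recursive \
        --no-clobber \
        --page-requisites \
        --html-extension \
        --convert-links \
        --restrict-file-names=windows \
        --domains example.com \
        --no-parent \
        --input-file="${1:-list.txt}"
}
```

Calling mirror_from_list list.txt then runs a single wget process that works through every URL in the file.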

Another approach is to read the list from the file into an array and loop over it:

readarray -t LIST < list.txt

for URL in "${LIST[@]}"; do
    wget \
        --recursive \
        --no-clobber \
        --page-requisites \
        --html-extension \
        --convert-links \
        --restrict-file-names=windows \
        --domains example.com \
        --no-parent \
        "$URL"
done

Similarly, a while read loop would work.
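A minimal sketch of that while read variant follows, assuming the same options and a list file with one URL per line. The function name mirror_each_url is hypothetical; skipping blank lines and tolerating a missing final newline are additions not mentioned above.

```shell
# Sketch: run wget once per line of the list file.
# mirror_each_url is a hypothetical helper name, not part of wget.
# The first argument is the list file; it defaults to list.txt.
mirror_each_url() {
    # IFS= and -r keep each line exactly as written; the || test also
    # picks up a last line that has no trailing newline.
    while IFS= read -r url || [ -n "$url" ]; do
        # Skip blank lines.
        [ -n "$url" ] || continue
        wget \
            --recursive \
            --no-clobber \
            --page-requisites \
            --html-extension \
            --convert-links \
            --restrict-file-names=windows \
            --domains example.com \
            --no-parent \
            "$url"
    done < "${1:-list.txt}"
}
```

Unlike the readarray version, this form does not need bash 4 and reads the file line by line instead of loading it all into an array first.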
