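I want to use wget to download the files linked from a website's home page, but I only want the text/html files. Is it possible to restrict wget to text/html files based on the MIME content type?
Answer 0 (score: 1)
I don't think this has been implemented yet, because it is still on the bug list: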
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=21148
You will probably have to do everything by file extension instead.
Answer 1 (score: 0)
Wget2 has this feature:
`--filter-mime-type  Specify a list of mime types to be saved or ignored`
### `--filter-mime-type=list`
Specify a comma-separated list of MIME types that will be downloaded. Elements of list may contain wildcards.
If a MIME type starts with the character '!' it won't be downloaded, this is useful when trying to download
something with exceptions. For example, download everything except images:

    wget2 -r https://<site>/<document> --filter-mime-type=*,\!image/*

It is also useful to download files that are compatible with an application of your system. For instance,
download every file that is compatible with LibreOffice Writer from a website using the recursive mode:

    wget2 -r https://<site>/<document> --filter-mime-type=$(sed -r '/^MimeType=/!d;s/^MimeType=//;s/;/,/g' /usr/share/applications/libreoffice-writer.desktop)

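To make the excerpt above concrete, here is a minimal sketch of two things: what the `sed` program produces from a `.desktop` file, and what the invocation for the original question (text/html only) might look like. The sample desktop entry, the variable names, and the `https://example.com/` URL are placeholders I made up; the script only prints the wget2 command rather than running it, and it assumes GNU sed (for `-r`).

```shell
#!/bin/sh
# Sample .desktop file standing in for a real one (contents are invented).
desktop_file=$(mktemp)
cat > "$desktop_file" <<'EOF'
[Desktop Entry]
Name=Example Writer
MimeType=application/vnd.oasis.opendocument.text;application/msword;
Exec=example %U
EOF

# Same sed program as in the manual excerpt: delete every line that does
# not start with MimeType=, strip that key, and turn ';' into ','.
mime_list=$(sed -r '/^MimeType=/!d;s/^MimeType=//;s/;/,/g' "$desktop_file")
rm -f "$desktop_file"
echo "MIME list: $mime_list"

# For the original question: keep only text/html during a recursive fetch.
# The command is built as a string and printed, not executed.
html_cmd="wget2 -r --filter-mime-type=text/html https://example.com/"
echo "Would run: $html_cmd"
```

Note that the trailing `;` in the `MimeType=` line turns into a trailing `,` in the resulting list, so it may be worth trimming it if that causes trouble.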
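As of this writing Wget2 has not been released yet, but it should be soon. Debian unstable already ships an alpha version.
See https://gitlab.com/gnuwget/wget2 for more information. You can send questions and comments directly to bug-wget@gnu.org.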