Question

I'm trying something new. I would normally do this in C# or VB, but for speed reasons I'd like to do it on my server.

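Open the file terms.txt
Take one item at a time from terms.txt, open a URL (maybe with curl or something similar) and go to http://system.com/set=terms
Look at the HTML source and extract the image name (StringB), looking for image=StringB&location
Save StringB to imgname.txt
Close the file and loop back to terms.txt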

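I was looking at sed, but I believe awk may be the best way? Building a command like this to run under a shell is completely new to me. I'm familiar with using Linux and just need help with the commands.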

Answer 1

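Something along these lines should do it, depending on the exact format of terms.txt (one entry per line works best for shell scripts) and on whether you really need to parse the HTML (I hope you don't):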

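#!/bin/sh

# Require exactly two arguments: the term file and the base URL.
if [ $# -ne 2 ]; then
    echo "usage: $0 termfile baseurl" >&2
    exit 1
fi
termfile="$1"
baseurl="$2"

# Fetch the page for each term and print whatever sits between "image=" and the next "&".
while IFS= read -r term; do
    wget -q -O- "$baseurl/set=$term" |
      sed -ne 's/^.*image=\([^&]*\)&.*$/\1/p'
done < "$termfile"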

Save this to a file named "extractimages", chmod +x it, and then run it like this:

$ ./extractimages terms.txt http://system.com > imgname.txt
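
To see what the sed expression does in isolation, here is a small sketch that runs it on one line of HTML. The function name extract_image and the sample markup are made up for illustration; the real page may look different, and the pattern only catches one image= per line.

```shell
#!/bin/sh
# extract_image: read HTML on stdin and print the text between
# "image=" and the next "&". Lines without a match print nothing (-n plus p).
# (Hypothetical helper name, not part of the script above.)
extract_image() {
    sed -n 's/^.*image=\([^&]*\)&.*$/\1/p'
}

# Hypothetical sample line; the actual markup on system.com is unknown.
sample='<a href="view.php?image=cat01.jpg&location=home">cat</a>'
printf '%s\n' "$sample" | extract_image
```

Because `.*` is greedy, a line with several image= parameters would yield only the last one.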

Answer 2

sed 's|^.*$|wget -q -O- http:\/\/system.com/set=&|' file | bash |sed -ne 's/^.*image=\([^&]*\)&.*$/\1/p'
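
Note that this one-liner feeds generated command lines to bash, so a term containing spaces or shell metacharacters would be interpreted by the shell. A rough sketch of a loop that avoids that and uses curl (which the question mentions) instead of wget follows. The function names fetch and extract_images are assumptions for illustration, and terms are still not URL-encoded.

```shell
#!/bin/sh
# fetch URL: print the page body to stdout.
# -f fails on HTTP errors, -s hides the progress meter, -S still shows errors.
fetch() {
    curl -fsS "$1"
}

# extract_images TERMFILE BASEURL
# For each non-empty line in TERMFILE, fetch BASEURL/set=<term> and print
# the text between "image=" and the next "&".
extract_images() {
    while IFS= read -r term; do
        [ -n "$term" ] || continue   # skip blank lines
        fetch "$2/set=$term" |
          sed -n 's/^.*image=\([^&]*\)&.*$/\1/p'
    done < "$1"
}
```

After defining the functions, something like `extract_images terms.txt http://system.com > imgname.txt` would mirror the usage from Answer 1.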

Find and copy a string in HTML code
