Question

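I have been searching for a while and cannot find a solution to my problem.

I have a line from an HTML file, extracted with sed '162!d' skinlist.html, that contains the text

<a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard">

I want to extract the text Dwarf Red Beard, but that text is variable (it can change), so I want to extract whatever sits between title=" and the following ".

For the life of me, I cannot figure out how to do this.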

Answer 1

awk 'NR==162 {print $4}' FS='"' skinlist.html

Set the field separator to "
Process only line 162
Print field 4
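
To see why field 4 is the right one, here is a small sketch that runs the same split on the sample line from the question instead of on skinlist.html. The `line` and `title` variable names are just for illustration.

```shell
# Sample line from the question, standing in for line 162 of skinlist.html.
line='<a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard">'

# With " as the field separator, awk sees these fields:
#   $1 = <a href=
#   $2 = /skin/dwarf-red-beard-734/
#   $3 =  title=
#   $4 = Dwarf Red Beard
#   $5 = >
title=$(printf '%s\n' "$line" | awk -F'"' '{ print $4 }')

printf '%s\n' "$title"
```

This relies on title being the second quoted attribute on the line. If the attribute order can change, one of the pattern-based approaches is probably safer.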

Answer 2

A solution in sed:

sed -n '162 s/^.*title="\(.*\)".*$/\1/p' skinlist.html

This selects line 162 of skinlist.html and captures the contents of the title attribute in \1.
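
One caveat worth checking: \(.*\) is greedy, so it runs to the last " on the line. That is fine for the line in the question, but if another attribute ever follows title, a bounded capture such as \([^"]*\) may be the safer choice. A sketch, assuming a hypothetical line with an extra class attribute (not from the original file):

```shell
# Hypothetical line with an attribute after title; not taken from skinlist.html.
line='<a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard" class="skin">'

# Greedy capture: \(.*\) extends to the last " on the line.
greedy=$(printf '%s\n' "$line" | sed -n 's/^.*title="\(.*\)".*$/\1/p')

# Bounded capture: [^"]* stops at the first " after title=".
bounded=$(printf '%s\n' "$line" | sed -n 's/^.*title="\([^"]*\)".*$/\1/p')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

Only the bounded version yields just the title here; the greedy one also picks up the start of the class attribute.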

Answer 3

The shell's variable expansion syntax lets you trim prefixes and suffixes from a string:

line="$(sed '162!d' skinlist.html)"   # extract the relevant line from the file
temp="${line#* title=\"}"    # remove from the beginning through the first match of ' title="'
if [ "$temp" = "$line" ]; then
    echo "title not found in '$line'" >&2
else
    title="${temp%%\"*}"   # remove from the first '"' through the end
fi
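
If this is needed in more than one place, the same two expansions can be wrapped in a function. This is only a sketch; extract_title is a made-up helper name, and the sample line is the one from the question rather than the real file.

```shell
# extract_title is a hypothetical helper, not part of the original answer.
# It prints the title attribute of the line passed as $1, or fails with status 1.
extract_title() {
    line=$1
    temp="${line#* title=\"}"        # drop everything through the first ' title="'
    if [ "$temp" = "$line" ]; then   # nothing was removed, so there is no title attribute
        echo "title not found in '$line'" >&2
        return 1
    fi
    printf '%s\n' "${temp%%\"*}"     # drop from the first '"' through the end
}

title=$(extract_title '<a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard">')
printf '%s\n' "$title"
```

Because the function is called inside $(...), its line and temp variables live in a subshell and do not leak into the caller.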

Answer 4

You can pipe the line to another sed, or add expressions to the existing sed command, for example:

sed -e 's/.*title="//g' -e 's/">.*$//g'
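
Combining the line selection and both substitutions into one sed call could look like the sketch below. Since skinlist.html is not available here, it builds a throwaway file with filler lines and the question's sample line at position 162; the filler content and the tmp variable are assumptions for the demo only.

```shell
# Build a stand-in for skinlist.html: 161 filler lines, then the target line.
tmp=$(mktemp)
i=1
while [ "$i" -le 161 ]; do
    echo '<p>filler</p>'
    i=$((i + 1))
done > "$tmp"
echo '<a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard">' >> "$tmp"

# One sed invocation: delete every line except 162, then strip the text
# before the title value and the '">' (plus anything after it) that follows.
title=$(sed -e '162!d' -e 's/.*title="//' -e 's/">.*$//' "$tmp")
rm -f "$tmp"

printf '%s\n' "$title"
```

The second expression assumes title is the last attribute before the closing >, as it is in the question. The g flag is left out because each pattern can match at most once per line.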

Answer 5

Also with sed:

sed -n '162 s/.*"\([a-zA-Z ]*\)"./\1/p' skinlist.html

This prints the text between two strings on the same line.
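
Note that the character class [a-zA-Z ] only admits letters and spaces, so a title containing a digit or punctuation would not match and nothing would be printed. A sketch of the difference, using a made-up title with a digit in it and [^"]* as one possible replacement:

```shell
# Hypothetical line whose title contains a digit; not taken from skinlist.html.
line='<a href="/skin/dwarf-red-beard-2/" title="Dwarf Red Beard 2">'

# Letters-and-spaces class: no quoted string on this line qualifies, so no output.
letters_only=$(printf '%s\n' "$line" | sed -n 's/.*"\([a-zA-Z ]*\)"./\1/p')

# Any run of non-quote characters: matches the last quoted string on the line.
any_chars=$(printf '%s\n' "$line" | sed -n 's/.*"\([^"]*\)"./\1/p')

printf 'letters only: [%s]\n' "$letters_only"
printf 'any chars:    [%s]\n' "$any_chars"
```

Both patterns pick the last quoted string that is followed by one more character, so they still depend on title being the final attribute on the line.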
