Question

I'm trying to extract the text inside this tag:

<div class="rpt_price rpt_price_1">THE TEXT</div>

using this command:

t=$(curl -v --silent http://somewebsite.info/ 2>&1 | grep -E "^<div class=\"rpt_price rpt_price_1\">.*</div>$"); echo $t

It should return THE TEXT, but it prints nothing. What is my mistake? Thanks in advance!

Answer 1

You have not escaped the forward slash / in the closing </div>.

The corrected regular expression would be:

^<div class=\"rpt_price rpt_price_1\">.*<\/div>$

For regular expressions, this is a handy tool for checking the results as you build them.
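grep itself can also serve as a quick local tester: pipe a sample line into it and see whether the pattern matches. The sketch below is only an illustration; the helper name try_pattern and both sample lines are made up, and the real page may be laid out differently.

```shell
#!/bin/sh
# try_pattern PATTERN LINE: print "match" or "no match" for grep -E.
# (Hypothetical helper, introduced only for this illustration.)
try_pattern() {
    if printf '%s\n' "$2" | grep -Eq -- "$1"; then
        echo "match"
    else
        echo "no match"
    fi
}

# Pattern from the question. "/" is an ordinary character in an extended
# regular expression, so it is written without a backslash here.
pattern='^<div class="rpt_price rpt_price_1">.*</div>$'

# Made-up sample lines, not taken from the real page.
exact='<div class="rpt_price rpt_price_1">THE TEXT</div>'
indented='    <div class="rpt_price rpt_price_1">THE TEXT</div>'

try_pattern "$pattern" "$exact"      # the div fills the whole line
try_pattern "$pattern" "$indented"   # leading spaces come before the div
```

The first call reports a match and the second does not, because ^ and $ tie the opening and closing tags to the two ends of the line.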

Answer 2

The following works:

grep -Po "<div class=\"rpt_price rpt_price_1\">\K(.*)(?=</div>$)"

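The -P and -o options are described on Unix Stack Exchange, where \K is also well explained.

In short, \K drops everything matched up to that point from the reported match, so with -o only the text after the opening tag is printed. The lookahead (?=</div>$) requires the closing tag at the end of the line but keeps it out of the output.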

Test:

echo "<div class=\"rpt_price rpt_price_1\">THE TEXT</div>" | grep -Po "<div class=\"rpt_price rpt_price_1\">\K(.*)(?=</div>$)"

Output:

THE TEXT
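To see what \K and the lookahead each contribute, the same sample line can be run through three variants of the pattern. This is a rough sketch that assumes a grep built with PCRE support (the -P option, as in GNU grep); the names line, show, p1, p2 and p3 are made up for the comparison.

```shell
#!/bin/sh
# Sample input, same as in the test above.
line='<div class="rpt_price rpt_price_1">THE TEXT</div>'

# show PATTERN: run grep -Po with the given pattern on the sample line.
show() {
    printf '%s\n' "$line" | grep -Po -- "$1"
}

# 1) Neither \K nor lookahead: -o prints the whole match, tags included.
p1='<div class="rpt_price rpt_price_1">.*</div>$'
# 2) \K only: the opening tag is dropped, the closing tag is still printed.
p2='<div class="rpt_price rpt_price_1">\K.*</div>$'
# 3) \K plus lookahead: only the text between the tags is printed.
p3='<div class="rpt_price rpt_price_1">\K.*(?=</div>$)'

show "$p1"   # <div class="rpt_price rpt_price_1">THE TEXT</div>
show "$p2"   # THE TEXT</div>
show "$p3"   # THE TEXT
```

The patterns are kept in single quotes so the shell passes \K and $ to grep untouched.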

Another option is to use perl directly, as explained on Super User:

perl -ne 'print $1 if /\<div class="rpt_price rpt_price_1">(.*?)\<\/div>/s'

Test:

echo "<div class=\"rpt_price rpt_price_1\">THE TEXT</div>" | perl -ne 'print $1 if /\<div class="rpt_price rpt_price_1">(.*?)\<\/div>/s'

Output:

THE TEXT