Question

I have an XML file with many lines like this:

<xhtml:link vip="true" href="http://store.vcenter.com/stores/en/product/tigers-midi/100" />

How can I extract only the link, http://store.vcenter.com/stores/en/product/tigers-midi/100?

I tried a pattern that matches from http:// onward, but it captured everything up to the end of the line, including the closing quote and the closing XML tag.

I am using this expression with egrep.

Answer 1

Don't parse HTML with regex; use a proper XML/HTML parser.

See: Using regular expressions with HTML tags. You can use a command-line XML tool for this.

File:

<root>
<xhtml:link vip="true" href="http://store.vcenter.com/stores/en/product/tigers-midi/100" />
</root>

Example with xmllint:

xmllint --xpath '//*[@vip="true"]/@href' file.xml 2>/dev/null

Output:

 href="http://store.vcenter.com/stores/en/product/tigers-midi/100"
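The output above still carries the attribute name and the quotes. If only the bare URL is wanted, one possible variant is to wrap the XPath in string(), which returns the value of the first matching attribute. This is a sketch, not part of the original answer: sample.xml is a hypothetical file name, and the xmlns:xhtml declaration on the root element is an assumption added so the file parses without a namespace warning.

```shell
# Hypothetical sample file; the namespace declaration is an assumption.
cat > sample.xml <<'EOF'
<root xmlns:xhtml="http://www.w3.org/1999/xhtml">
<xhtml:link vip="true" href="http://store.vcenter.com/stores/en/product/tigers-midi/100" />
</root>
EOF

# string() yields the attribute's value rather than the attribute node,
# so no 'href="..."' wrapper is printed.
url=$(xmllint --xpath 'string(//*[@vip="true"]/@href)' sample.xml 2>/dev/null)
printf '%s\n' "$url"
```

Note that string() only returns the first match, so a file with several vip="true" links would still need the node-set form or a loop.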

If you need a quick & dirty one-off command, you can do this:

egrep -o 'https?://[^"]+' file
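The reason this pattern stops at the right place is the negated character class: [^"]+ matches any run of characters that are not a double quote, so the match ends just before the closing quote of the attribute. The sketch below contrasts it with a greedy .* pattern, which is only an assumed stand-in for the kind of expression that runs to the end of the line. The file name sample.xml is hypothetical, and grep -E is used as the equivalent of egrep.

```shell
# Hypothetical sample file mirroring the one in the answer.
cat > sample.xml <<'EOF'
<root>
<xhtml:link vip="true" href="http://store.vcenter.com/stores/en/product/tigers-midi/100" />
</root>
EOF

# Greedy: .* runs to the end of the line, so the quote and "/>" are included.
greedy=$(grep -Eo 'https?://.*' sample.xml)

# Negated class: [^"]+ stops just before the closing double quote.
bounded=$(grep -Eo 'https?://[^"]+' sample.xml)

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

This still treats the file as plain text, so it will also pick up URLs in comments or in other attributes; the parser-based approach avoids that.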