Question

I want to retrieve the file URL from the content attribute of an HTML meta tag.

Here is a sample of the HTML:

 <meta content="https://www.domain.com/player/player-viral.swf?config=
https://www.domain.com/configxml?id=133291&logo.
link=http://www.domain.org/Amin+Rostami/-/Havam+Toei&
image=https://www.domain.com/img/3lv68bc5w-1396897306.jpeg&provider=audio&
file=http://s10.domain.me/music/A/[one]/test-msusic.mp3" property="og:video"/>

I want to get the file URL, which in this example is http://s10.domain.me/music/A/[one]/test-msusic.mp3

Answer 1

You can use substring-after() to extract the link that follows file= in the content attribute of the meta tag:

substring-after(//meta/@content, "file=")

Demo (using xmllint):

$ cat input.xml
<meta content="https://www.domain.com/player/player-viral.swf?config=
https://www.domain.com/configxml?id=133291&amp;logo.
link=http://www.domain.org/Amin+Rostami/-/Havam+Toei&amp;
image=https://www.domain.com/img/3lv68bc5w-1396897306.jpeg&amp;provider=audio&amp;
file=http://s10.domain.me/music/A/[one]/test-msusic.mp3" property="og:video"/>

$ xmllint input.xml --xpath 'substring-after(//meta/@content, "file=")'
http://s10.domain.me/music/A/[one]/test-msusic.mp3
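If you need the same result from a script rather than from xmllint, here is a minimal Python sketch. It uses only the standard library, whose ElementTree module does not evaluate XPath string functions, so substring-after() is imitated with str.partition(). The helper names substring_after and extract_file_url are made up for this illustration, and the input is assumed to be well-formed XML with the ampersands escaped as &amp;, as in input.xml.

```python
import xml.etree.ElementTree as ET

# Same document as input.xml (ampersands escaped as &amp;).
XML = (
    '<meta content="https://www.domain.com/player/player-viral.swf?config=\n'
    'https://www.domain.com/configxml?id=133291&amp;logo.\n'
    'link=http://www.domain.org/Amin+Rostami/-/Havam+Toei&amp;\n'
    'image=https://www.domain.com/img/3lv68bc5w-1396897306.jpeg&amp;provider=audio&amp;\n'
    'file=http://s10.domain.me/music/A/[one]/test-msusic.mp3" property="og:video"/>'
)


def substring_after(text, marker):
    """Imitate XPath substring-after(): return the text following the
    first occurrence of marker, or '' if marker does not occur."""
    return text.partition(marker)[2]


def extract_file_url(xml_text):
    """Return whatever follows 'file=' in the first <meta> element's content attribute."""
    root = ET.fromstring(xml_text)
    # iter() yields the root itself when it is a <meta>, as well as nested ones.
    meta = next(root.iter("meta"), None)
    content = meta.get("content", "") if meta is not None else ""
    return substring_after(content, "file=")


file_url = extract_file_url(XML)
print(file_url)
```

As with the XPath expression, this returns everything after the first file=, so it only yields a clean URL when file is the last parameter in the content value.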
