I have the following structure:
<html>
<head>
<title>Index of /</title>
</head>
<body>
<h1>Index of /</h1>
<pre>
<img src="/icons/blank.gif" alt="Icon "> <a href="?C=N;O=D">Name</a> <a
href="?C=M;O=A">Last modified</a> <a href="?C=S;O=A">Size</a> <a
href="?C=D;O=A">Description</a>
<hr>
<img src="/icons/folder.gif" alt="[DIR]"> <a href="berta.ear/">berta.ear/</a> 23-Sep-2014 13:17 -
<img src="/icons/folder.gif" alt="[DIR]"> <a href="egon.ear/">egon.ear/</a> 24-Oct-2014 16:04 -
<img src="/icons/folder.gif" alt="[DIR]"> <a href="anton01.ear/">anton01.ear/</a> 18-Dec-2014 12:03 -
<img src="/icons/folder.gif" alt="[DIR]"> <a href="anton02.ear/">anton02.ear/</a> 18-Dec-2014 08:38 -
<img src="/icons/folder.gif" alt="[DIR]"> <a href="anton03.ear/">anton03.ear/</a> 18-Dec-2014 11:43 -
<img src="/icons/folder.gif" alt="[DIR]"> <a href="anton04.ear/">anton04.ear/</a> 05-Dec-2014 16:02 -
<img src="/icons/folder.gif" alt="[DIR]"> <a href="anton05.ear/">anton05.ear/</a> 15-Sep-2014 19:22 -
<img src="/icons/folder.gif" alt="[DIR]"> <a href="anton06.ear/">anton06.ear/</a> 17-Dec-2014 10:50 -
<img src="/icons/folder.gif" alt="[DIR]"> <a href="anton07.ear/">anton07.ear/</a> 10-Dec-2014 13:02 -
<img src="/icons/folder.gif" alt="[DIR]"> <a href="anton08.ear/">anton08.ear/</a> 15-Dec-2014 09:30 -
<img src="/icons/folder.gif" alt="[DIR]"> <a href="anton09.ear/">anton09.ear/</a> 18-Dec-2014 08:47 -
<img src="/icons/folder.gif" alt="[DIR]"> <a href="anton10.ear/">anton10.ear/</a> 18-Dec-2014 11:11 -
....
</pre>
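So now I am trying to get the information from the <a href=...> elements, but only for those that follow an <img src=".." alt="[DIR]"> element.
So I created an XPath that looks like this: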
tester.getElementsByXPath("/html/body/pre/*[self::img[@alt='[DIR]']]");
The above only gives me the <img ...> elements, but what I need are the <a href=""..> elements.
Does anyone know what I am doing wrong?
Answer 0 (score: 1)
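Your expression stops at the img step, so that is what it returns. You can use the following-sibling axis to step from each matching img to the next a sibling. The [1] predicate matters: without it, the expression returns every a that follows a matching img, including the links of later entries that are not directories:
/html/body/pre/*[self::img[@alt='[DIR]']]/following-sibling::a[1]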
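To see the effect of the following-sibling step, here is a minimal sketch that uses only the JDK's built-in XPath engine instead of the `tester` object from the question. The class name `DirLinksFromImg`, the helper `hrefs`, and the `readme.txt` entry are made up for illustration, and the sample is a trimmed copy of the listing that has already been made well-formed XML (self-closing `<img/>` and `<hr/>`). The expression carries a `[1]` predicate so that only the nearest `a` after each matching `img` is taken.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DirLinksFromImg {
    // Trimmed, well-formed copy of the listing. The readme.txt line is a
    // hypothetical non-directory entry added to show the filtering.
    static final String SAMPLE =
        "<html><body><pre>"
        + "<img src=\"/icons/blank.gif\" alt=\"Icon \"/> <a href=\"?C=N;O=D\">Name</a>"
        + "<hr/>"
        + "<img src=\"/icons/folder.gif\" alt=\"[DIR]\"/> <a href=\"berta.ear/\">berta.ear/</a>\n"
        + "<img src=\"/icons/unknown.gif\" alt=\"[   ]\"/> <a href=\"readme.txt\">readme.txt</a>\n"
        + "<img src=\"/icons/folder.gif\" alt=\"[DIR]\"/> <a href=\"egon.ear/\">egon.ear/</a>\n"
        + "</pre></body></html>";

    // Parses the XML, evaluates the XPath, and returns the href of each selected element.
    static List<String> hrefs(String xml, String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(xpath, doc, XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            out.add(((Element) nodes.item(i)).getAttribute("href"));
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String base = "/html/body/pre/*[self::img[@alt='[DIR]']]/following-sibling::a";
        // With [1]: only the link right after each [DIR] image.
        System.out.println(hrefs(SAMPLE, base + "[1]"));
        // Without [1]: every later link, so readme.txt is included as well.
        System.out.println(hrefs(SAMPLE, base));
    }
}
```

On this sample the first call yields only the two directory links, while the second also picks up `readme.txt`, because that link is a following sibling of the first `[DIR]` image.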
Answer 1 (score: 1)
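Once the HTML has been tidied so that it can be parsed as XML, and assuming the <img> tags are self-closing (i.e. they do not wrap the a elements), this XPath should find every a whose closest preceding img sibling has the attribute alt='[DIR]':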
/html//a[(preceding-sibling::img[1])[1][local-name()='img' and @alt='[DIR]']]
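As a rough check of this expression, the sketch below evaluates it with the JDK's `javax.xml.xpath` API against a small hand-tidied sample. The class name `DirLinksFromAnchor`, the helper `select`, and the `readme.txt` entry are hypothetical. The tidying step itself is not shown; the sample is assumed to be well-formed already.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DirLinksFromAnchor {
    // Hand-tidied excerpt of the listing. The header links follow the blank
    // icon, and readme.txt is a made-up non-directory entry.
    static final String SAMPLE =
        "<html><body><pre>"
        + "<img src=\"/icons/blank.gif\" alt=\"Icon \"/> <a href=\"?C=N;O=D\">Name</a> "
        + "<a href=\"?C=M;O=A\">Last modified</a>"
        + "<hr/>"
        + "<img src=\"/icons/folder.gif\" alt=\"[DIR]\"/> <a href=\"berta.ear/\">berta.ear/</a>\n"
        + "<img src=\"/icons/unknown.gif\" alt=\"[   ]\"/> <a href=\"readme.txt\">readme.txt</a>\n"
        + "<img src=\"/icons/folder.gif\" alt=\"[DIR]\"/> <a href=\"egon.ear/\">egon.ear/</a>\n"
        + "</pre></body></html>";

    // The expression from the answer, anchored on the a elements themselves.
    static final String XPATH =
        "/html//a[(preceding-sibling::img[1])[1][local-name()='img' and @alt='[DIR]']]";

    // Returns the href attribute of every element the XPath selects.
    static List<String> select(String xml, String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(xpath, doc, XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            out.add(((Element) nodes.item(i)).getAttribute("href"));
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(select(SAMPLE, XPATH));
    }
}
```

The header links and `readme.txt` are skipped because the nearest img before them has a different alt value. Since the predicate sits on the a element, further conditions on the link (for example on its href) can be added inside the same brackets.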