I want to extract some specific information from this HTML:
<div class="main">
<div class="a"><div><a>linkname1</a></div></div> <!-- I DON'T want to get the text of this 'a' tag -->
<div class="b">xxx</div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname2</a></div></div> <!-- I want to get the text of this 'a' tag -->
<div class="a"><div><a>linkname3</a></div></div> <!-- I want to get the text of this 'a' tag -->
<div class="a"><div><a>linkname4</a></div></div> <!-- I want to get the text of this 'a' tag -->
<div class="a"><div><a>linkname5</a></div></div> <!-- I want to get the text of this 'a' tag -->
<div class="d"></div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname6</a></div></div> <!-- I DON'T want to get the text of this 'a' tag -->
<div class="a"><div><a>linkname7</a></div></div> <!-- I DON'T want to get the text of this 'a' tag -->
<div class="a"><div><a>linkname8</a></div></div> <!-- I DON'T want to get the text of this 'a' tag -->
<div class="d"></div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname9</a></div></div> <!-- I DON'T want to get the text of this 'a' tag -->
<div class="a"><div><a>linkname10</a></div></div> <!-- I DON'T want to get the text of this 'a' tag -->
</div>
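I'd like to get, as an array, the list of link texts in the second block of class 'a' tags (the ones between the first div with class 'c' and the second div with class 'c'). How can I do this with an XPath selector? Is it even possible? I couldn't figure out how.
With my example, the expected result is: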
linkname2
linkname3
linkname4
linkname5
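Thanks :)
Answer 0 (score: 2)
Your problem is a set problem, as described in this answer: [{3}}.
So, applied to your specific case, you should use an intersection, as follows: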
(: intersection :)
$set1[count(. | $set2) = count($set2)]
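To see why this formula keeps only the common nodes, here is a minimal Python sketch of the same test on ordinary sets. The function name `intersection` is made up for illustration; the idea is that adding an item to set2 leaves its size unchanged only when the item is already a member of set2.

```python
def intersection(set1, set2):
    """Keep the items of set1 whose union with set2 does not grow set2."""
    s2 = set(set2)
    # Mirrors: $set1[count(. | $set2) = count($set2)]
    return [item for item in set1 if len({item} | s2) == len(s2)]


print(intersection([1, 2, 3, 4], [3, 4, 5]))  # [3, 4]
```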
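set1 should be the set of siblings following div[@class='c'], and set2 should be the set of siblings preceding div[@class='d'].
Now, putting the two together according to the formula above, with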
set1 = "div[@class='c'][1]/following-sibling::*" and
set2 = "div[@class='d'][1]/preceding-sibling::*"
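the XPath expression could look like this (evaluated with the div of class 'main' as the context node; note that current() is an XSLT function, so this form is meant for use inside XSLT rather than in a bare XPath 1.0 evaluator):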
div[@class='c'][1]/following-sibling::*[count(. | current()/div[@class='d'][1]/preceding-sibling::*) = count(current()/div[@class='d'][1]/preceding-sibling::*)]
Output:
linkname2
linkname3
linkname4
linkname5
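For readers without an XSLT processor at hand, the same intersection can be sketched in Python with the standard library. This is an illustration under assumptions, not the XPath itself: `xml.etree.ElementTree` supports only a subset of XPath (no count(), no sibling axes), so the two sets are built from child positions under div.main. The helper name `links_between_first_c_and_first_d` is hypothetical, and the markup is the question's HTML with the comments removed.

```python
import xml.etree.ElementTree as ET

HTML = """
<div class="main">
<div class="a"><div><a>linkname1</a></div></div>
<div class="b">xxx</div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname2</a></div></div>
<div class="a"><div><a>linkname3</a></div></div>
<div class="a"><div><a>linkname4</a></div></div>
<div class="a"><div><a>linkname5</a></div></div>
<div class="d"></div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname6</a></div></div>
<div class="a"><div><a>linkname7</a></div></div>
<div class="a"><div><a>linkname8</a></div></div>
<div class="d"></div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname9</a></div></div>
<div class="a"><div><a>linkname10</a></div></div>
</div>
"""


def links_between_first_c_and_first_d(markup):
    main = ET.fromstring(markup)
    children = list(main)
    classes = [el.get("class") for el in children]
    first_c = classes.index("c")
    first_d = classes.index("d")
    # set1: positions of the siblings following the first div.c
    set1 = set(range(first_c + 1, len(children)))
    # set2: positions of the siblings preceding the first div.d
    set2 = set(range(first_d))
    # Same membership test as the formula: count(. | set2) = count(set2)
    between = [i for i in sorted(set1) if len({i} | set2) == len(set2)]
    return [a.text for i in between for a in children[i].findall("./div/a")]


print(links_between_first_c_and_first_d(HTML))
# ['linkname2', 'linkname3', 'linkname4', 'linkname5']
```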
Answer 1 (score: 0)
You can try this expression:
/div/div[position() > 3 and position() < 8]/div/a/text()
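As a rough sketch of what this expression selects, the Python below picks the 4th through 7th child divs of the root by position. It uses only the standard library and slices the child list, because `xml.etree.ElementTree` does not evaluate position() comparisons; the helper name `links_at_positions` is hypothetical. Note that this approach depends on the blocks staying at fixed positions, so it would return different links if the number of elements before the first div.c changed.

```python
import xml.etree.ElementTree as ET

HTML = """
<div class="main">
<div class="a"><div><a>linkname1</a></div></div>
<div class="b">xxx</div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname2</a></div></div>
<div class="a"><div><a>linkname3</a></div></div>
<div class="a"><div><a>linkname4</a></div></div>
<div class="a"><div><a>linkname5</a></div></div>
<div class="d"></div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname6</a></div></div>
<div class="a"><div><a>linkname7</a></div></div>
<div class="a"><div><a>linkname8</a></div></div>
<div class="d"></div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname9</a></div></div>
<div class="a"><div><a>linkname10</a></div></div>
</div>
"""


def links_at_positions(markup, first, last):
    """Return link texts from child divs at 1-based positions first..last."""
    main = ET.fromstring(markup)
    child_divs = main.findall("./div")
    # XPath positions are 1-based and inclusive; Python slices are 0-based
    selected = child_divs[first - 1:last]
    return [a.text for el in selected for a in el.findall("./div/a")]


# position() > 3 and position() < 8 means positions 4 through 7
print(links_at_positions(HTML, 4, 7))
# ['linkname2', 'linkname3', 'linkname4', 'linkname5']
```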
Answer 2 (score: 0)
I found a possible solution :)
//following::div[@class='a' and count(preceding::div[@class="c"]) = 1]/div/a/text()
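The counting idea behind this expression can be sketched in Python as a single pass over the children of div.main: keep the div.a elements seen while exactly one div.c has been passed. This is a standard-library approximation, not the XPath itself, since `xml.etree.ElementTree` has no count() or preceding axis; the helper name `links_after_nth_c` and its parameter `n` are hypothetical. Unlike the intersection approach, this one does not depend on div.d at all.

```python
import xml.etree.ElementTree as ET

HTML = """
<div class="main">
<div class="a"><div><a>linkname1</a></div></div>
<div class="b">xxx</div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname2</a></div></div>
<div class="a"><div><a>linkname3</a></div></div>
<div class="a"><div><a>linkname4</a></div></div>
<div class="a"><div><a>linkname5</a></div></div>
<div class="d"></div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname6</a></div></div>
<div class="a"><div><a>linkname7</a></div></div>
<div class="a"><div><a>linkname8</a></div></div>
<div class="d"></div>
<div class="c">xxx</div>
<div class="a"><div><a>linkname9</a></div></div>
<div class="a"><div><a>linkname10</a></div></div>
</div>
"""


def links_after_nth_c(markup, n=1):
    """Return link texts of div.a children preceded by exactly n div.c siblings."""
    main = ET.fromstring(markup)
    seen_c = 0
    links = []
    for el in main:
        cls = el.get("class")
        if cls == "c":
            seen_c += 1
        elif cls == "a" and seen_c == n:
            links.extend(a.text for a in el.findall("./div/a"))
    return links


print(links_after_nth_c(HTML, 1))
# ['linkname2', 'linkname3', 'linkname4', 'linkname5']
```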