Unfortunately I have to scrape a web page, and I'm doing it through Google Docs.
The document looks like this:
<div class='search'>
<div class='new'>
<img src="product1.png" title="Product 1 - €2.40"/>
</div>
<div class='new dupe'> <!-- this one appears dimmed: there's a better offer -->
<!-- I don't want these in my results -->
<img src="product1.png" title="Product 1 - €2.70"/>
</div>
</div>
My current XPath looks like this:
//div[@class='search']//@title
How should I modify it? I could do
//div[@class='search']//div[not(@class='dupe')]//@title
...but that doesn't work, because no element's class list is exactly 'dupe'.
Answer 0 (score: 4)
/div[@class='search']/div[not(contains(@class, 'dupe'))]//@title
I would try to avoid using // and be more specific:
/div[@class='search']/div[not(contains(@class, 'dupe'))]/img/@title
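
To check the expressions outside Google Docs, here is a minimal sketch in Python. It assumes the third-party lxml library is installed and that the snippet from the question is the entire document. In a real page the root element is html, so the leading / would probably need to become //. The last expression is a stricter variant that is not part of the answer: contains() is a substring test, so it would also exclude a class such as 'dupes', while the token form matches only the whole class name.

```python
from lxml import etree  # third-party dependency (assumption): pip install lxml

# The snippet from the question, treated here as the whole document.
DOC = """
<div class='search'>
  <div class='new'>
    <img src="product1.png" title="Product 1 - €2.40"/>
  </div>
  <div class='new dupe'>
    <img src="product1.png" title="Product 1 - €2.70"/>
  </div>
</div>
""".strip()

root = etree.fromstring(DOC)

# Attempt from the question: @class='dupe' compares the whole attribute
# value, so 'new dupe' is not excluded and both titles are returned.
naive = root.xpath("//div[@class='search']//div[not(@class='dupe')]//@title")

# Expression from the answer: substring test on the class attribute.
fixed = root.xpath(
    "/div[@class='search']/div[not(contains(@class, 'dupe'))]/img/@title"
)

# Stricter variant (not from the answer): pad the class list with spaces
# so that 'dupe' only matches as a complete class token.
token_safe = root.xpath(
    "/div[@class='search']/div"
    "[not(contains(concat(' ', normalize-space(@class), ' '), ' dupe '))]"
    "/img/@title"
)

print(naive)
print(fixed)
print(token_safe)
```

On this snippet the first query returns both titles, while the other two return only the €2.40 one.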