我是python和xpath的新手, 我有一个像这样的HTML代码:
<a name="hello"></a>
<h3>hello</h3>
<table />
<a name="impact"></a>
<h3>Impact</h3>
<table cellspacing="0" cellpadding="0" border="0" class="wrapper-table"><tr> <td><p>An unauthenticated attacker using a specifically crafted payload may be able to trick the Ruby on Rails backend into executing arbitrary code.</p></td></tr></table>
我希望用字符串中的所有标签和文本以及...保存整个表格。 我想要影响标题之后的表标记。
答案 0 :(得分:0)
使用
tables = root.xpath('.//table[preceding-sibling::h3[text()="Impact"]]')
或
tables = root.xpath('.//h3[text()="Impact"]/following-sibling::table')
或
tables = root.cssselect('h3:contains(Impact) ~ table')
root = tree.getroot()
tables = root.xpath('.//h3[text()="Impact"]/following-sibling::table')
for table in tables:
print str