Python lxml cssselect: selecting the nth match

Time: 2012-06-13 07:54:19

Tags: python css-selectors lxml

Say I have some HTML similar to this:

<div id="content">
  <span class="green">something</span>
  <span class="blue">something</span>
  <span class="red">something</span>
  <span class="green">something</span>
  <span class="yellow">something</span>
</div>

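What is the best way to get the second span.green element using cssselect? I can always call cssselect('span.green') and then pick the second element from the result, but on a large page with hundreds of elements I would guess that is much slower.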

1 answer:

Answer 0: (score: 1)

This is not strictly an answer to your question, but here is how I would do it:

Use XPath instead of cssselect:

>>> from lxml.etree import tostring
>>> from lxml.html.soupparser import fromstring
>>> x = fromstring('<div id="content"><span class="green">something</span><span class="blue">something</span><span class="red">something</span><span class="green">something</span><span class="yellow">something</span></div>')
>>> x.xpath('//span[@class="green"][2]')
[<Element span at b6df71ac>]
>>> x.xpath('//span[@class="green"][2]')[0]
<Element span at b6df71ac>
>>> tostring(x.xpath('//span[@class="green"][2]')[0])
'<span class="green">something</span>'
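
One detail worth keeping in mind about the positional predicate: in XPath, //span[@class="green"][2] selects the second matching span within each parent, while (//span[@class="green"])[2] selects the second match in the whole document. With the flat HTML above the two give the same result, but they differ once the spans are nested under different parents. The sketch below illustrates this; the nested markup and the names doc, per_parent and overall are made up for the example, and it assumes lxml is installed and uses lxml.html.fromstring rather than the soupparser.

```python
from lxml import html

# Hypothetical markup: green spans spread over two different parents.
doc = html.fromstring(
    '<div id="content">'
    '<p><span class="green">a</span></p>'
    '<p><span class="green">b</span><span class="green">c</span></p>'
    '</div>'
)

# Second green span *within each parent*: only the second <p> has two.
per_parent = doc.xpath('//span[@class="green"][2]')

# Second green span *in document order*, regardless of parent.
overall = doc.xpath('(//span[@class="green"])[2]')

print([e.text for e in per_parent])  # ['c']
print([e.text for e in overall])     # ['b']
```

If "the second green element on the page" is what you want, the parenthesized form is probably the safer one.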

Or, if you prefer to work with a Python list of elements:

>>> x.xpath('//span[@class="green"]')
[<Element span at b6df71ac>, <Element span at b6df720c>]
>>> tostring(x.xpath('//span[@class="green"]')[1])
'<span class="green">something</span>'