lxml: cssselect(): AttributeError: 'lxml.etree._Element' object has no attribute 'cssselect'

Time: 2015-08-28 06:21:21

Tags: python css-selectors lxml

Can someone explain why the first call to root.cssselect() works, while the second one fails?

from lxml.html import fromstring
from lxml import etree

html = '<html><a href="http://example.com">example</a></html>'
root = fromstring(html)
print('via fromstring', repr(root))  # via fromstring <Element html at 0x...>
print(root.cssselect("a"))

root2 = etree.HTML(html)
print('via etree.HTML()', repr(root2))  # via etree.HTML() <Element html at 0x...>
root2.cssselect("a")  # --> Exception

I get:

Traceback (most recent call last):
  File "/home/foo_eins_d/src/foo.py", line 11, in <module>
    root2.cssselect("a")
AttributeError: 'lxml.etree._Element' object has no attribute 'cssselect'

1 Answer:

Answer 0 (score: 2)

The difference is in the type of the elements. Example:

In [12]: root = etree.HTML(html)

In [13]: root = fromstring(html)

In [14]: root2 = etree.HTML(html)

In [15]: type(root)
Out[15]: lxml.html.HtmlElement

In [16]: type(root2)
Out[16]: lxml.etree._Element

lxml.html.HtmlElement has the method cssselect(). Also, HtmlElement is a subclass of etree._Element.

But lxml.etree._Element itself does not have that method.

If you want to parse HTML, you should use lxml.html.