How to parse HTML using the lxml.html library

Time: 2014-03-12 21:39:51

Tags: python lxml.html

Here is the HTML shown on my website:

<meta content="auth" name="param" />
<meta content="I_WANT_THIS" name="token" />

How can I grab the content value of the meta tag named "token" using lxml.html?

1 answer:

Answer 0: (score: 2)

Use xpath to find the meta tag by its name attribute, and get the value of its content attribute:

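from lxml.html import fromstring

# Sample markup from the question
html_data = """ <meta content="auth" name="param" />
 <meta content="I_WANT_THIS" name="token" />"""

tree = fromstring(html_data)
# Select the <meta> whose name is "token" and return its content attribute
print(tree.xpath('//meta[@name="token"]/@content'))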

Prints:

['I_WANT_THIS']
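
Because xpath() returns a list, it can help to wrap the lookup in a small helper that yields a single string, or a default when no matching tag exists. The following is only a sketch: the helper name get_meta_content and its default parameter are hypothetical and not part of lxml. It passes the name as an XPath variable instead of formatting it into the query string.

```python
from lxml.html import fromstring


def get_meta_content(html_text, name, default=None):
    """Return the content attribute of the first <meta> with the given name.

    Hypothetical helper for illustration; falls back to `default` when
    no matching tag is found.
    """
    tree = fromstring(html_text)
    # $name is an XPath variable bound through the keyword argument
    values = tree.xpath('//meta[@name=$name]/@content', name=name)
    return values[0] if values else default


html_data = """ <meta content="auth" name="param" />
 <meta content="I_WANT_THIS" name="token" />"""

print(get_meta_content(html_data, "token"))
print(get_meta_content(html_data, "missing", default=""))
```

The first call returns the token value as a plain string rather than a one-element list; the second returns the supplied default because no meta tag has that name.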