Extracting the value of an A tag via XPath in Python

Time: 2014-09-25 11:56:25

Tags: python xpath

I have a simple Python script like this:

#!/usr/bin/python
import requests
from lxml import html
response = requests.get('http://site.ir/')
out=response.content
tree = html.fromstring(open(out).read())
print [e.text_content() for e in tree.xpath('//div[class="group"]/div[class="groupinfo"]/a/text()')]

I am using XPath to get the value of the a tags, as shown in the image below... [image] However, the output is not what I expected.

UPDATE: I also get the following error:

Traceback (most recent call last):
  File "p.py", line 7, in <module>
    tree = html.fromstring(open(out).read())
IOError: [Errno 36] File name too long: '\n<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" ....

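1 Answer:

Answer 0: (score: 2)

In XPath, an attribute is addressed by putting @ in front of its name. Without the @, class is read as the name of a child element, so the predicate matches nothing: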

//div[@class="group"]/div[@class="groupinfo"]/a/text()
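
Two other details in the question's script would likely still get in the way once the XPath is fixed. The traceback comes from open(out): response.content already holds the page itself, so open() tries to use the whole HTML document as a file name. Also, an XPath ending in /text() returns plain strings, which have no text_content() method. Below is a minimal sketch of the corrected extraction. SAMPLE_HTML and extract_link_texts are hypothetical names, and the sample markup is only an assumption about how the real page is structured; with the live site you would pass response.content instead.

```python
from lxml import html

# Hypothetical markup standing in for the real page (its structure is assumed).
SAMPLE_HTML = """
<html><body>
  <div class="group">
    <div class="groupinfo"><a href="/one">First link</a></div>
    <div class="groupinfo"><a href="/two">Second link</a></div>
  </div>
</body></html>
"""


def extract_link_texts(markup):
    """Return the text of every <a> inside div.group > div.groupinfo."""
    # Parse the markup directly: html.fromstring() expects the document
    # itself (str or bytes), not a file name, so no open() call is needed.
    tree = html.fromstring(markup)
    # @class addresses the attribute; text() already yields strings,
    # so calling text_content() on the results is unnecessary.
    texts = tree.xpath('//div[@class="group"]/div[@class="groupinfo"]/a/text()')
    return [t.strip() for t in texts]


link_texts = extract_link_texts(SAMPLE_HTML)
print(link_texts)

# With the live page this would be, for example:
#   response = requests.get('http://site.ir/')
#   print(extract_link_texts(response.content))
```

If you would rather work with the elements themselves, drop /text() from the expression and call text_content() on each returned a element instead.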