Is my understanding wrong?
import requests
from lxml import etree
url = 'http://www.forexfactory.com/calendar.php?'
date = {'day':'feb9.2017'}
resp = requests.get(url,date)
tree = etree.HTML(resp.text)
dataId = tree.xpath("string(//*[@id='flexBox_flex_calendar_mainCal']//tr[contains(@class,'calendar__row calendar_row')])")
Answer 0 (score: 2)
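The string() function wrapped around the XPath returns only the text of the first matched element. If you remove it, xpath() returns a list of the matched elements. From there, you can iterate over the elements and read the data-eventid attribute from each element's attrib property: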
tree = etree.HTML(resp.text)
for row in tree.xpath("//*[@id='flexBox_flex_calendar_mainCal']//tr[contains(@class,'calendar__row calendar_row')]"):
    print(row.attrib['data-eventid'])
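To see the difference without hitting the live site, here is a minimal sketch that runs the same query against a small inline document. The markup and the event ids (101, 102) are made up for illustration and only mimic the structure the XPath expects; the real page may differ.

```python
from lxml import etree

# Hypothetical sample markup standing in for the real calendar page.
html = """
<table id="flexBox_flex_calendar_mainCal">
  <tr class="calendar__row calendar_row" data-eventid="101"><td>First</td></tr>
  <tr class="calendar__row calendar_row" data-eventid="102"><td>Second</td></tr>
</table>
"""
tree = etree.HTML(html)
query = ("//*[@id='flexBox_flex_calendar_mainCal']"
         "//tr[contains(@class,'calendar__row calendar_row')]")

# string() converts the node-set to the text of its first node only,
# so the second row is silently dropped and no attributes are available.
first_text = tree.xpath("string(" + query + ")")

# Without string(), xpath() returns a list of elements to iterate over.
rows = tree.xpath(query)
event_ids = [row.attrib['data-eventid'] for row in rows]

print(repr(first_text))
print(event_ids)
```

The first result is a plain string, which is why the original query never exposed the attributes; the second is a list with one element per matching row.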
Also, since you always access the element's data-eventid attribute, it may be safer to select only the elements that actually have that attribute by adding [@data-eventid] to your XPath:
tree = etree.HTML(resp.text)
for row in tree.xpath("//tr[contains(@class,'calendar__row calendar_row')][@data-eventid]"):
    print(row.attrib['data-eventid'])
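As a rough illustration of why the predicate helps, the sketch below uses made-up markup in which one matching row has no data-eventid. The extra class name on that row and the ids (201, 202) are assumptions for the example, not taken from the real page.

```python
from lxml import etree

# Hypothetical markup: the first row matches the class test
# but carries no data-eventid attribute.
html = """
<table>
  <tr class="calendar__row calendar_row calendar__row--day-breaker"><td>Thu Feb 9</td></tr>
  <tr class="calendar__row calendar_row" data-eventid="201"><td>Event A</td></tr>
  <tr class="calendar__row calendar_row" data-eventid="202"><td>Event B</td></tr>
</table>
"""
tree = etree.HTML(html)
base = "//tr[contains(@class,'calendar__row calendar_row')]"

# Without the predicate, all three rows match, and
# row.attrib['data-eventid'] would raise KeyError on the first one.
all_rows = tree.xpath(base)

# With [@data-eventid], only rows that have the attribute are selected.
event_rows = tree.xpath(base + "[@data-eventid]")
ids = [row.attrib['data-eventid'] for row in event_rows]

# Alternative: row.get() returns None instead of raising when the attribute is missing.
maybe_ids = [row.get('data-eventid') for row in all_rows]

print(ids)
print(maybe_ids)
```

Filtering in the XPath keeps the loop body simple; using row.get() is the other option if you also need to see the rows that lack the attribute.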