Question

I'm using a Python 2.7 script to extract dates from a website. The code is as follows:

from lxml import html, etree
from urllib2 import urlopen
import requests

url = 'http://www.cardiffdevils.com/fixtures/'
newtree = etree.HTML(urlopen(url).read())

for section in newtree.xpath('//div[@class="month"]'):
    print section.xpath('h3[1]/text()')
    print section.xpath('//td[@class="date"]/text()')

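The months print correctly, but I'm trying to limit the dates printed for each section to only those found inside the corresponding "month" div. At the moment it spits out every date found on the entire page. Any pointers would be much appreciated!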

Answer 1

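Start the XPath with a period (.) so that it is evaluated relative to the context element. An expression that begins with // always searches from the document root, no matter which element you call xpath() on: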

print section.xpath('.//td[@class="date"]/text()')

The last question I answered had this very same problem (different language, different XPath processor): Foreach not iterating through elements
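
To see the difference without hitting the live site, here is a minimal sketch that runs both expressions against a small inline document. The HTML in SAMPLE is invented for illustration and only mimics the div.month / td.date structure described in the question; the real page's markup may differ. It is written to run under both Python 2.7 and Python 3.

```python
from __future__ import print_function

from lxml import etree

# Hypothetical markup standing in for the fixtures page (an assumption,
# not copied from the real site).
SAMPLE = """
<html><body>
  <div class="month"><h3>September</h3>
    <table>
      <tr><td class="date">Sat 5th</td></tr>
      <tr><td class="date">Sun 6th</td></tr>
    </table>
  </div>
  <div class="month"><h3>October</h3>
    <table>
      <tr><td class="date">Sat 3rd</td></tr>
    </table>
  </div>
</body></html>
"""

tree = etree.HTML(SAMPLE)
sections = tree.xpath('//div[@class="month"]')

# '//td' is an absolute path: it starts at the document root, so every
# section returns all dates on the page.
absolute = [s.xpath('//td[@class="date"]/text()') for s in sections]

# './/td' is relative: it only looks at descendants of the current section.
relative = [s.xpath('.//td[@class="date"]/text()') for s in sections]

for section, dates in zip(sections, relative):
    print(section.xpath('h3[1]/text()'), dates)
```

With this sample, each entry in absolute holds all three dates, while relative holds only the dates that belong to its own month.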

lxml / xpath - limiting output