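I'm using XPath with Python to try to pull data from the site in the code below. I have managed to download most of the data (after some time), but I can't extract the Greyhound and DogDetail fields. The Greyhound data is actually inside an a tag with an href path, and after trying various XPath variations I still can't get the data. The overall plan is to download greyhound racing results and load them into a database (or spreadsheet). Any help is appreciated.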
from lxml import html
import requests

# Download the race result page and parse it into an element tree
page = requests.get('http://www.gbgb.org.uk/resultsRace.aspx?id=1838526')
tree = html.fromstring(page.content)

# Race header fields
track = tree.xpath('//div[@class="track"]/text()')
print('Track', track)
date = tree.xpath('//div[@class="date"]/text()')
print('date', date)
datetime = tree.xpath('//div[@class="datetime"]/text()')
print('datetime', datetime)

# Per-dog fields
# Problem line: this query does not return the greyhound names
essentialgreyhound = tree.xpath('//a[@href="essential greyhound"]/text()')
print('Greyhound', essentialgreyhound)
firstessentialfin = tree.xpath('//li[@class="first essential fin"]//text()')
print('Position:', firstessentialfin)
sp = tree.xpath('//li[@class="sp"]/text()')
print('StartingPrice:', sp)
trap = tree.xpath('//li[@class="trap"]/text()')
print('Trap:', trap)
trainer = tree.xpath('//li[@class="essential trainer"]/text()')
print('Trainer:', trainer)
timeSec = tree.xpath('//li[@class="timeSec"]/text()')
print('TimeSec', timeSec)
timeDistance = tree.xpath('//li[@class="timeDistance"]/text()')
print('TimeDistance', timeDistance)
firstessentialcomment = tree.xpath('//li[@class="first essential comment"]/text()')
print('Comment', firstessentialcomment)
# Problem line: this query does not return the dog details either
firstessential = tree.xpath('//li[@class="first essential"]/text()')
print('DogDetail', firstessential)
Answer 0 (score: 0)
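You should fix the XPath for the Greyhound column. The original expression compares the href attribute of the a tag with the class name, so it matches nothing. Select the li by its class and then step into its a child: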
//li[@class="essential greyhound"]/a/text()
For me, this prints:
Greyhound ['Ultimate Bundle', 'Powerfast Raven', 'Upagumtree', 'Buglys Causeway', 'Group Vespa', 'Winword Jacko']
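To make the difference between the two expressions easier to see without hitting the live site, here is a minimal sketch that runs both against a small inline snippet. The markup in SAMPLE and the href values are assumptions modeled on the class names used in the question; the real page may be structured differently. The sketch also pulls the href of each link, since the question mentions that the greyhound data sits behind an href path.

```python
from lxml import html

# Hypothetical markup: only the class names come from the question,
# the structure and href values are made up for illustration.
SAMPLE = """
<div>
  <ul>
    <li class="first essential fin">1</li>
    <li class="essential greyhound"><a href="/dog-a">Ultimate Bundle</a></li>
    <li class="trap">3</li>
  </ul>
  <ul>
    <li class="first essential fin">2</li>
    <li class="essential greyhound"><a href="/dog-b">Powerfast Raven</a></li>
    <li class="trap">5</li>
  </ul>
</div>
"""

tree = html.fromstring(SAMPLE)

# Original attempt: tests the <a> tag's href against the class name,
# so nothing matches.
broken = tree.xpath('//a[@href="essential greyhound"]/text()')

# Fixed: match the <li> by its full class attribute, then read the <a> child.
names = tree.xpath('//li[@class="essential greyhound"]/a/text()')
links = tree.xpath('//li[@class="essential greyhound"]/a/@href')

# Pair each name with its link, one row per dog.
rows = list(zip(names, links))

print('broken:', broken)
print('rows:', rows)
```

Note that @class="..." is an exact string comparison against the whole class attribute, so the value in the XPath has to match the attribute as written in the page source. Rows built this way can then be passed to csv.writer(...).writerows(rows) or inserted into a database table, in line with the plan described in the question.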