Using Python and XPath

Time: 2015-08-11 11:27:27

Tags: python xpath web-scraping lxml

I am trying to get data from the following website:

http://mozo.com.au/credit-cards/search#fetch/680

Using Chrome's "Inspect Element" feature, I was able to find the XPath of the element I want:

//*[@id="p-40"]/div[4]/table/tbody/tr/td[1]/text()

I expected the following code to give me the text "9.99%":

import requests
from lxml import html

# Fetch the page and parse the response with lxml
page = requests.get('http://mozo.com.au/credit-cards/search#fetch/680')
tree = html.fromstring(page.text)

tree.xpath('//*[@id="p-40"]/div[4]/table/tbody/tr/td[1]/text()')

However, the output is an empty list. Where did I go wrong?

1 answer:

Answer 0 (score: 4):

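As tobifasc noted, the page is loaded dynamically, so the data is not in the HTML that requests receives. One option is to let a real browser render the page with Selenium.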
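A small sketch of one detail behind this: the `#fetch/680` part of the URL is a fragment, and fragments are handled on the client side rather than sent to the server in the HTTP request. Presumably the page's JavaScript reads it to decide what to load, though that is an assumption about this site and not something confirmed here. The standard library can show which part of the URL is actually requested:

```python
from urllib.parse import urldefrag

# URL taken from the question.
url = 'http://mozo.com.au/credit-cards/search#fetch/680'

# Split the URL into the part that is requested and the client-side fragment.
request_url, fragment = urldefrag(url)

print(request_url)  # http://mozo.com.au/credit-cards/search
print(fragment)     # fetch/680
```

So a plain HTTP client only ever asks for the search page itself; anything keyed on the fragment has to be produced by code running in a browser.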

First, install Selenium:

pip3 install selenium

Then:

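import lxml.html
from selenium import webdriver

url = 'http://mozo.com.au/credit-cards/search#fetch/680'  # URL from the question

# Let a real browser load the page so its JavaScript runs
driver = webdriver.Firefox()
driver.get(url)

# Parse the rendered DOM rather than the raw server response
tree = lxml.html.fromstring(driver.page_source)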

Now you can query:

# With your xpath there are 2 results...
results = tree.xpath('//*[@id="p-40"]/div[4]/table/tbody/tr/td[1]/text()')   
results[1].strip()
'9.99%'
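Indexing with `results[1]` depends on how many text nodes the cell happens to contain. A sketch of a less position-dependent approach is to drop whitespace-only nodes and strip the rest. The real page's markup is not shown here, so the HTML below is invented and only imitates the path in the question:

```python
import lxml.html

# Made-up HTML that imitates the structure in the question;
# it is not the markup of the real page.
sample = '''
<div id="p-40">
  <div></div><div></div><div></div>
  <div>
    <table><tbody><tr>
      <td>
        <span>Purchase rate</span>
        9.99%
      </td>
      <td>other</td>
    </tr></tbody></table>
  </div>
</div>
'''

tree = lxml.html.fromstring(sample)
nodes = tree.xpath('//*[@id="p-40"]/div[4]/table/tbody/tr/td[1]/text()')

# Drop whitespace-only text nodes and trim the remaining ones.
values = [n.strip() for n in nodes if n.strip()]
print(values)  # ['9.99%']
```

`text()` selects only the direct text nodes of the cell, so the text inside the `<span>` is not included, and surrounding whitespace can show up as extra entries.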