Selector for links (XPath or CSS)

Time: 2017-03-05 06:52:08

Tags: python html css xpath

I am trying to grab the href attribute of every shoe link on this site:

http://www.soccerpro.com/Clearance-Soccer-Shoes-c168/

But I can't find a selector that works. This one returns an empty list:

response.xpath('.//*[@class="newnav itemnamelink"]')
[]

Does anyone know how to do this with XPath or CSS?

1 answer:

Answer 0 (score: 1)

The links you need are generated dynamically, so you cannot get them from the HTML source returned by requests.get("http://www.soccerpro.com/Clearance-Soccer-Shoes-c168/").
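One way to check this for yourself is to run the same class test against the raw HTML and against the rendered HTML. The sketch below uses only the standard library. The two HTML snippets are made up for illustration and are not the real page markup, and `LinkCollector` and `find_links` are hypothetical helper names.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags that carry all required CSS classes."""

    def __init__(self, required_classes):
        super().__init__()
        self.required = set(required_classes)
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # Split the class attribute into tokens so that order does not matter
        classes = set((attr_map.get("class") or "").split())
        if self.required <= classes and attr_map.get("href"):
            self.hrefs.append(attr_map["href"])


def find_links(html, required_classes=("newnav", "itemnamelink")):
    """Return the hrefs of matching links found in an HTML string."""
    parser = LinkCollector(required_classes)
    parser.feed(html)
    parser.close()
    return parser.hrefs


# Hypothetical snippets: what a plain HTTP fetch might return versus
# what the browser holds after the page's JavaScript has run.
static_html = '<div id="products"></div><script src="load-products.js"></script>'
rendered_html = (
    '<table class="getproductdisplay-innertable"><tr><td>'
    '<a class="newnav itemnamelink" href="/example-shoe-1.html">Shoe 1</a>'
    '<a class="newnav itemnamelink" href="/example-shoe-2.html">Shoe 2</a>'
    '</td></tr></table>'
)

print(find_links(static_html))    # []
print(find_links(rendered_html))  # ['/example-shoe-1.html', '/example-shoe-2.html']
```

If the response body you downloaded gives an empty list here as well, the selector is probably fine and the elements are simply not in that HTML yet.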

You can use selenium to get the values you need through a browser session:

from selenium import webdriver as web
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait

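# Start a Chrome session and load the page
driver = web.Chrome()
driver.get('http://www.soccerpro.com/Clearance-Soccer-Shoes-c168/')
# Wait up to 10 seconds for the dynamically generated product table
wait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//table[@class='getproductdisplay-innertable']")))
# find_elements(By.XPATH, ...) replaces find_elements_by_xpath, which was removed in Selenium 4
links = [link.get_attribute('href') for link in driver.find_elements(By.XPATH, '//a[@class="newnav itemnamelink"]')]
driver.quit()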