I am trying to extract the URL of each restaurant from this page, and I wrote a Python script for it:
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

browser = webdriver.Firefox()
browser.get("http://www.delyver.com/Partners/partner/HSR%20Layout,%20Bengaluru,%20Karnataka,%20India/12.9081357/77.64760799999999")
time.sleep(1)

elem = browser.find_element_by_tag_name("body")
no_of_pagedowns = 40
while no_of_pagedowns:
    elem.send_keys(Keys.PAGE_DOWN)
    time.sleep(0.2)
    no_of_pagedowns -= 1

post1 = browser.find_elements_by_css_selector("Parwrsp.Parwrsp-Ado")
for post in post1:
    print post.get('href')
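When I run the script, a browser window opens, I maximize it to give it focus, and it scrolls down automatically. But nothing is printed. I set up Selenium by following this link.
What am I doing wrong?
Answer 0 (score: 0)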
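Your current CSS selector does not match any element: Parwrsp is a class, but without a leading dot the selector treats it as a tag name.
To match an element that carries several classes, prefix each class with a dot and chain them:
.Parwrsp.Parwrsp-Ado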
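To make the difference between the two selectors concrete, here is a minimal sketch of how a compound selector splits into a tag name and a list of classes. The helper name parse_compound_selector is hypothetical, and it only handles an optional tag followed by .class parts; real CSS selector grammar is much richer.

```python
import re


def parse_compound_selector(selector):
    """Split a simple compound selector into (tag, classes).

    Hypothetical helper for illustration only: it accepts an optional
    tag name followed by zero or more .class parts and nothing else.
    """
    match = re.fullmatch(r"([A-Za-z][\w-]*)?((?:\.[\w-]+)*)", selector)
    if match is None:
        raise ValueError("unsupported selector: %r" % selector)
    tag = match.group(1)  # None when the selector starts with a dot
    classes = [part for part in match.group(2).split(".") if part]
    return tag, classes


# Without the leading dot, "Parwrsp" is read as a tag name.
print(parse_compound_selector("Parwrsp.Parwrsp-Ado"))
# -> ('Parwrsp', ['Parwrsp-Ado'])

# With the leading dot, both names are classes on the same element.
print(parse_compound_selector(".Parwrsp.Parwrsp-Ado"))
# -> (None, ['Parwrsp', 'Parwrsp-Ado'])
```

Since the page has no element whose tag is Parwrsp, the first form finds nothing, which is consistent with the empty output in the question.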
Also, WebElement instances have no get() method; you meant to use get_attribute():
posts = browser.find_elements_by_css_selector(".Parwrsp.Parwrsp-Ado")
for post in posts:
    print post.get_attribute('href')
Proof that the above works:
>>> from selenium import webdriver
>>>
>>> browser = webdriver.Firefox()
>>> browser.get("http://www.delyver.com/Partners/partner/HSR%20Layout,%20Bengaluru,%20Karnataka,%20India/12.9081357/77.64760799999999")
>>> for post in browser.find_elements_by_css_selector(".Parwrsp.Parwrsp-Ado"):
...     print post.get_attribute('href')
...
http://www.delyver.com/Partners/partnerdetailsview/947/Purnabramha,-HSR
http://www.delyver.com/Partners/partnerdetailsview/916/Moti-Mahal-Deluxe,-HSR-Layout
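The snippets above use Python 2 print statements and the find_elements_by_css_selector helper, which recent Selenium 4 releases no longer provide. A rough equivalent for Python 3 and Selenium 4 might look like the sketch below. The function names extract_hrefs and restaurant_urls are hypothetical, and the code assumes the driver has already loaded the page and that the class names on the site are unchanged.

```python
def extract_hrefs(elements):
    """Collect the non-empty href attribute of each element, in order.

    Hypothetical helper: works with any objects exposing get_attribute(),
    such as Selenium WebElement instances.
    """
    hrefs = []
    for element in elements:
        href = element.get_attribute("href")  # WebElement has no get()
        if href:
            hrefs.append(href)
    return hrefs


def restaurant_urls(driver, selector=".Parwrsp.Parwrsp-Ado"):
    """Find elements carrying both classes and return their hrefs.

    Assumes `driver` is a Selenium 4 WebDriver with the page loaded.
    """
    # Imported lazily so extract_hrefs stays usable without Selenium.
    from selenium.webdriver.common.by import By

    return extract_hrefs(driver.find_elements(By.CSS_SELECTOR, selector))
```

With a live session, this would be called as restaurant_urls(browser) after browser.get(...) and any scrolling needed to load more entries.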