I want to scrape some reviews of a restaurant on Google Maps (Nobu Palo Alto, for example) using the Selenium Chrome WebDriver:
https://www.google.com/maps/place/Nobu+Palo+Alto/@37.4437223,-122.1637038,17z/data=!3m1!4b1!4m11!1m3!2m2!1srestaurants!6e5!3m6!1s0x0:0x5bb11772add3928!8m2!3d37.4437179!4d-122.1615154!9m1!1b1
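I am using the function below. It does seem to get (and print) the scroll height via JavaScript, but instead of scrolling indefinitely it breaks out of the loop as soon as the last height equals the new height, even though I know there are more reviews that have not loaded yet: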
def __init__(self, site):
    self.site = site
    self.option = webdriver.ChromeOptions()
    self.option.add_argument("-incognito")
    self.browser = webdriver.Chrome(executable_path="C:/Users/me/Documents/project/chromedriver.exe", chrome_options=self.option)

def scroll(self):
    self.browser.get(self.site)
    SCROLL_PAUSE_TIME = 4
    sleep(SCROLL_PAUSE_TIME)
    # Get scroll height
    last_height = self.browser.execute_script("return document.querySelector('#pane > div > div.widget-pane-content.scrollable-y > div > div > div.section-listbox.section-scrollbox.scrollable-y.scrollable-show').scrollHeight")
    print("last height = " + str(last_height))
    while True:
        # Scroll down to bottom
        self.browser.execute_script("window.scrollTo(0, document.querySelector('#pane > div > div.widget-pane-content.scrollable-y > div > div > div.section-listbox.section-scrollbox.scrollable-y.scrollable-show').scrollHeight);")
        # Wait to load page
        sleep(SCROLL_PAUSE_TIME)
        # Calculate new scroll height and compare with last scroll height
        new_height = self.browser.execute_script("return document.querySelector('#pane > div > div.widget-pane-content.scrollable-y > div > div > div.section-listbox.section-scrollbox.scrollable-y.scrollable-show').scrollHeight")
        print("new height = " + str(new_height))
        if new_height == last_height:
            break
        last_height = new_height
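A likely reason the loop exits after one pass is that `window.scrollTo` scrolls the browser window, while the reviews sit inside their own scrollable pane, so the pane's `scrollHeight` never changes. The sketch below shows one way to scroll that element directly by setting its `scrollTop`. The helper names `scroll_pane_script` and `scroll_pane_to_bottom` are hypothetical, and the approach has not been verified against the current Google Maps markup.

```python
import json


def scroll_pane_script(css_selector):
    """Build JavaScript that scrolls one element (not the window) to its
    bottom and returns that element's scrollHeight."""
    # json.dumps yields a safely quoted JavaScript string literal.
    selector = json.dumps(css_selector)
    return (
        "var pane = document.querySelector(" + selector + ");"
        "pane.scrollTop = pane.scrollHeight;"
        "return pane.scrollHeight;"
    )


def scroll_pane_to_bottom(browser, css_selector):
    """Run the script on any object exposing Selenium's execute_script()."""
    return browser.execute_script(scroll_pane_script(css_selector))
```

Inside `scroll`, the call `scroll_pane_to_bottom(self.browser, selector)` could stand in for the `window.scrollTo` line, with `selector` being the same CSS selector string already used to read the height.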
Answer 0 (score: 2)
You don't have to scroll with JavaScript to load the reviews.
Here is a simple script that loads the required number of reviews.
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# `driver` is assumed to be a WebDriver that is already on the reviews page
reviewCount = len(driver.find_elements_by_xpath("//div[@class='section-review ripple-container']"))
# load a minimum of 50 reviews
while reviewCount < 50:  # <=== change this number based on your requirement
    # scroll the loading spinner into view, which triggers loading more reviews
    driver.find_element_by_xpath("//div[contains(@class,'section-loading-spinner')]").location_once_scrolled_into_view
    # wait for the reviews to finish loading
    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//div[@class='section-loading-overlay-spinner'][@style='display:none']")))
    # update the review count
    reviewCount = len(driver.find_elements_by_xpath("//div[@class='section-review ripple-container']"))
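One thing to watch for: if the place has fewer reviews than the requested number, a loop like the one above has no way to finish normally. Below is a minimal sketch of the same "load until enough" idea with a stall guard. `load_until`, `get_count`, `load_more`, and `max_stalls` are hypothetical names introduced for illustration and are not part of Selenium.

```python
def load_until(get_count, load_more, target, max_stalls=3):
    """Call load_more() until get_count() reaches target, or until the
    count has failed to grow max_stalls times in a row."""
    count = get_count()
    stalls = 0
    while count < target and stalls < max_stalls:
        load_more()
        new_count = get_count()
        if new_count > count:
            stalls = 0   # progress was made, reset the guard
        else:
            stalls += 1  # nothing new appeared this round
        count = new_count
    return count
```

With Selenium, `get_count` could wrap the `find_elements_by_xpath` count from the script above, and `load_more` could wrap the spinner scroll plus the `WebDriverWait` call. Passing them as callables keeps the stopping logic separate from the page-specific XPath expressions.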