Unable to scroll infinitely with Selenium

Time: 2019-02-12 10:17:45

Tags: python selenium twitter web-scraping beautifulsoup

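I am using Selenium to scrape the past year of tweets, but I cannot scroll the page beyond a certain point, where it shows "Back to top". How can I get past this with Selenium?

Here is my code: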

import time

from selenium import webdriver

driver = webdriver.Firefox(executable_path="/home/piyush/geckodriver")
url = "https://twitter.com/narendramodi"
driver.get(url)
time.sleep(6)  # wait for the initial page load

# Keep scrolling to the bottom until the page height stops growing
lastHeight = driver.execute_script("return document.body.scrollHeight")
while True:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(6)  # wait for more tweets to load
    newHeight = driver.execute_script("return document.body.scrollHeight")
    if newHeight == lastHeight:
        break
    lastHeight = newHeight

Here is the output as an image

1 answer:

Answer 0: (score: 0)

You can use something like the following. Try waiting for a while until "Back to top" disappears, then continue scraping.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get("http://somedomain/url_that_delays_loading")
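try:
    # invisibility_of_element_located succeeds once the element is hidden
    # or no longer in the DOM. Negating visibility_of_element_located with
    # `not` never succeeds, because the condition object is always truthy.
    disappeared = WebDriverWait(driver, 10).until(
        EC.invisibility_of_element_located((By.ID, "myDynamicElement"))
    )

    if disappeared:
        print('Continue')
finally:
    driver.quit()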
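
The scroll loop in the question gives up the first time the page height fails to grow, which can happen simply because the next batch of tweets has not loaded yet. Below is a minimal sketch of a more patient loop that retries a few times before stopping. The function name `scroll_until_exhausted` and the parameters `pause`, `max_stalls`, `sleep`, and `on_stall` are hypothetical, introduced only for illustration. It cannot help if the site itself stops serving older tweets.

```python
import time


def scroll_until_exhausted(driver, pause=2.0, max_stalls=3,
                           sleep=time.sleep, on_stall=None):
    """Scroll to the bottom until the height stalls `max_stalls` times in a row.

    `driver` only needs an `execute_script` method, as a Selenium WebDriver has.
    `on_stall` (hypothetical hook) is called each time the height did not grow,
    for example to wait for a "Back to top" element to disappear.
    Returns the number of scrolls that loaded more content.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    stalls = 0
    loaded = 0
    while stalls < max_stalls:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        sleep(pause)  # give the page time to fetch the next batch
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            stalls += 1  # nothing new yet: retry instead of quitting at once
            if on_stall is not None:
                on_stall()
        else:
            stalls = 0  # progress was made, so reset the stall counter
            loaded += 1
            last_height = new_height
    return loaded
```

With a real browser this would be called as `scroll_until_exhausted(driver, pause=6)` after `driver.get(url)`. Passing `on_stall=lambda: WebDriverWait(driver, 10).until(EC.invisibility_of_element_located((By.ID, "myDynamicElement")))` would combine it with the wait above; note that `"myDynamicElement"` is only a placeholder ID, and that the wait raises `TimeoutException` if the element never disappears.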