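I want to export all of the documents, so I need all of their links.
Not all of the links are loaded unless the mouse scrolls down.
The page has to be scrolled down a little at a time so that the links load gradually.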
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import time
# Configuration information
email = "187069474@qq.com"
password = "Huangbo1019@"
driver = webdriver.Chrome()
index_url = "https://testselenium.quip.com/BCWAOAUkg1v"
driver.get(url=index_url)
# find_element(By.XPATH, ...) replaces find_element_by_xpath, which was removed in Selenium 4
driver.find_element(By.XPATH, '//*[@id="header-nav-collapse"]/ul/li[9]/a').click()  # click login
time.sleep(1)
driver.find_element(By.XPATH, '/html/body/div[2]/div[1]/div[1]/form/div/input').send_keys(email)  # input email
driver.find_element(By.XPATH, '//*[@id="email-submit"]').click()
time.sleep(1)
driver.find_element(By.XPATH, '/html/body/div/div/form/div/input[2]').send_keys(password)  # input password
driver.find_element(By.XPATH, '/html/body/div/div/form/button').click()
time.sleep(2)
A new strategy may be needed.
Answer 0 (score: 0)
You could try using the requests library directly.
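import requests
from selenium.webdriver.common.by import By

links = browser.find_elements(By.TAG_NAME, 'a')  # browser is the WebDriver instance
for link in links:
    try:
        requests.get(link.get_attribute('href'))
    except requests.RequestException as exc:
        print(f"request failed: {exc}")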
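Before requesting anything, it may help to reduce the anchors to a clean list of URLs, since a page often repeats the same link and some anchors have no href at all. The following is a minimal sketch; collect_hrefs is a hypothetical helper name, not part of Selenium or requests, and it only assumes that each element offers get_attribute, as Selenium WebElements do.

```python
def collect_hrefs(elements):
    """Return the unique, non-empty href values of the given elements, in order."""
    seen = set()
    hrefs = []
    for element in elements:
        href = element.get_attribute("href")
        # Anchors without an href yield None; skip those and any duplicates.
        if href and href not in seen:
            seen.add(href)
            hrefs.append(href)
    return hrefs
```

It could be called as collect_hrefs(browser.find_elements(By.TAG_NAME, 'a')), and the resulting list passed to requests.get one URL at a time.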
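Answer 1 (score: 0)
You can use JavaScript to scroll by coordinates. If you can get the height of the page in pixels, you can call the Scroll(int xCoordinate, int yCoordinate) method and increase the scroll coordinate with each successive scroll.
Start by scrolling to an initial x,y coordinate on the page. As you scroll down, the y coordinate to scroll to changes; if you do not increase it, you will scroll to the same position over and over. You should therefore choose the increment variable based on the pixel height of the page you are working with. Depending on how many rows you want to scroll at a time, increment should be roughly equal to the pixel height of each "row" you scroll past.
numberOfScrolls should equal the number of rows you want to scroll.
The method looks like this: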
public void ScrollDown(int startX, int startY, int increment, int numberOfScrolls)
{
    // cast webdriver to javascript executor
    // (driver is assumed to be an IWebDriver field of the enclosing class)
    var executor = (IJavaScriptExecutor) driver;
    // keep track of current y coordinate
    // since we are scrolling down, y coordinate will change with each scroll
    var currentYCoordinate = startY;
    // scroll down in a loop
    for (int i = 0; i < numberOfScrolls; i++)
    {
        // scroll down; the coordinates are passed as script arguments,
        // because C# variable names inside the string are not visible to JavaScript
        executor.ExecuteScript("window.scrollTo(arguments[0], arguments[1]);", startX, currentYCoordinate);
        // todo: implement any FindElement methods to get the new elements which have been scrolled into view.
        // update y coordinate since we just scrolled down
        currentYCoordinate = currentYCoordinate + increment;
    }
}
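Since the question uses Python, the same idea can be sketched there as well. This is only an illustration under assumptions: scroll_down and its pause parameter are hypothetical names, the pause length is a guess at how long the page needs to load new links, and the driver argument is expected to provide execute_script, as a Selenium WebDriver does.

```python
import time


def scroll_down(driver, start_x, start_y, increment, number_of_scrolls, pause=0.5):
    """Scroll the page down step by step and return the y coordinates visited."""
    visited = []
    current_y = start_y
    for _ in range(number_of_scrolls):
        # Pass the coordinates as script arguments instead of building a string.
        driver.execute_script(
            "window.scrollTo(arguments[0], arguments[1]);", start_x, current_y
        )
        visited.append(current_y)
        # Assumed delay so that lazily loaded links have time to appear.
        time.sleep(pause)
        # Move the target further down for the next scroll.
        current_y += increment
    return visited
```

For example, scroll_down(driver, 0, 0, 300, 20) would scroll in 300-pixel steps; the links could then be collected after the loop, or inside it after each pause if elements disappear from the page once they scroll out of view.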