Scraping a dynamic page that needs scrolling with Selenium in Java

Time: 2021-04-29 18:30:35

Tags: java selenium web-scraping

I want to find and print the names of all the backgrounds listed on this page:

https://store.steampowered.com/points/shop/c/backgrounds/cluster/2

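The problem with my program is that I cannot scroll to the bottom of the page. I have tried:

  1. Actions
  2. Selecting a random element and sending the space key to it
  3. A bunch of JavaScript snippets (none of which worked on this site)
  4. The Robot class (I used it to send a Page Down key press and it did work, but I cannot do anything else on my PC while it is scrolling)

Here is my current program: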


public static void steamScraper() {
    
    System.setProperty("webdriver.chrome.driver", "C:\\Users\\user\\Documents\\Selenium\\chromedriver_win32\\chromedriver.exe");
    
    ChromeOptions options = new ChromeOptions();
    options.setBinary("C:\\Users\\user\\Downloads\\chrome-win\\chrome.exe");
    WebDriver driver = new ChromeDriver(options);
    
    driver.get("https://store.steampowered.com/points/shop/c/backgrounds/cluster/2");
    
    driver.manage().window().maximize();
    
    // let the page load for some time
    try {
        Thread.sleep(5000);
    } catch (InterruptedException e1) {
        // TODO Auto-generated catch block
        e1.printStackTrace();
    }
    
    JavascriptExecutor js = (JavascriptExecutor) driver;
    
    // i = 2200, because there are ~44000 items and 20 items load each scroll, thus 44000 / 20 = 2200
    
    for (int i = 0; i < 10; i++) {
        
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e1) {
            e1.printStackTrace();
        }
        
        System.out.println(i + 1 + ". " + "scrolling...");
        
    //  here is where I would do the scrolling, and I tried a lot of javascript methods to scroll to the bottom of the page
    //  yet none that I have tried yet would actually perform a scroll (though the method below worked on another site than Steam)
    //
    //  js.executeScript("window.scrollTo(0, document.body.scrollHeight)");

    }
    
    System.out.println("");
    
    // get all div classes containing an item (background)
    
    List<WebElement> nameOfGames = driver.findElements(By.xpath(".//div[@class='rewarditem_AppIconContainer_3Oyyi']//img"));
    
    for (WebElement webElement : nameOfGames) {

        // print the item name (title of img)
        
        System.out.println(webElement.getAttribute("title"));
        
    }
    
    
}

1 answer:

Answer 0 (score: 0)

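I think I just figured out a solution:

I created a new list, List<WebElement> gamesAfterFirstScroll = null;. Inside a for loop I fill this list by finding all the elements I am interested in, then I scroll to the last item with action.moveToElement(gamesAfterFirstScroll.get(lastGameItem)) followed by action.perform(). This is how it looks: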

        List<WebElement> nameOfGames = driver.findElements(By.xpath(".//div[@class='rewarditem_AppIconContainer_3Oyyi']//img"));
        
        Actions action = new Actions(driver);
    
        List<WebElement> gamesAfterFirstScroll = null;
        
        System.out.println("Scrolling.\r\n");
        
        for(int i = 0; i < 50; i++) {

            try {
                Thread.sleep(2000);
            } catch (InterruptedException e1) {
                // TODO Auto-generated catch block
                e1.printStackTrace();
            }
            
            gamesAfterFirstScroll = driver.findElements(By.xpath(".//div[@class='rewarditem_AppIconContainer_3Oyyi']//img"));
                
            int lastGameItem = gamesAfterFirstScroll.size() - 1;
            
            action.moveToElement(gamesAfterFirstScroll.get(lastGameItem));
            action.perform();
            
            System.out.println(i+1 + ". scroll");
            
        }
            
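        // The last findElements call normally returns every item loaded so far,
        // including the ones found before scrolling. Replace the initial list
        // instead of appending to it, which would duplicate the first batch.
        nameOfGames = gamesAfterFirstScroll;

        for (WebElement e : nameOfGames) {

            System.out.println(e.getAttribute("title"));
        }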
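
The loop above always scrolls 50 times, whether or not there is anything left to load. Here is a minimal sketch of an alternative, assuming the number of matching elements stops growing once the page has run out of items: keep scrolling until a scroll adds nothing new. The class name ScrollUntilStable, the method scrollUntilStable and its parameters are hypothetical. The loop is written against plain Java functional interfaces so it can be tried without a browser; the comment at the top shows how it might be wired to the driver and action objects from the answer.

```java
import java.util.function.IntSupplier;

// Hypothetical wiring with the objects from the answer above:
//   By items = By.xpath(".//div[@class='rewarditem_AppIconContainer_3Oyyi']//img");
//   int total = ScrollUntilStable.scrollUntilStable(
//       () -> driver.findElements(items).size(),
//       () -> {
//           List<WebElement> found = driver.findElements(items);
//           if (!found.isEmpty()) {
//               action.moveToElement(found.get(found.size() - 1)).perform();
//           }
//       },
//       2200, 2000);
final class ScrollUntilStable {

    private ScrollUntilStable() {
    }

    /**
     * Calls scrollStep repeatedly until itemCount stops growing or
     * maxScrolls is reached, pausing pauseMillis after each scroll.
     * Returns the last item count that was observed.
     */
    static int scrollUntilStable(IntSupplier itemCount, Runnable scrollStep,
                                 int maxScrolls, long pauseMillis) {
        int previous = itemCount.getAsInt();
        for (int i = 0; i < maxScrolls; i++) {
            scrollStep.run();
            try {
                // Give the page time to load the next batch of items.
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop scrolling early.
                Thread.currentThread().interrupt();
                return previous;
            }
            int current = itemCount.getAsInt();
            if (current <= previous) {
                // The last scroll loaded nothing new, so assume the end was reached.
                return current;
            }
            previous = current;
        }
        return previous;
    }
}
```

A pause that is too short on a slow connection would end the loop early, so pauseMillis may need to be generous. The maxScrolls cap keeps the loop from running forever if the page never stops loading.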