Question

I found these three potential answers, but they all use the HtmlUnit API. How can I avoid the HtmlUnit API and use only Selenium, or some configuration of the browser setup?

Answer 1

This is now part of HtmlUnit 2.25-SNAPSHOT: webClient.getOptions().setDownloadImages(true).

In the HtmlUnit-Driver 2.25 snapshot, use the DOWNLOAD_IMAGES_CAPABILITY capability or htmlUnitDriver.setDownloadImages(true).

Answer 2

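As far as I know, there is no way to make HtmlUnit (with or without Selenium) download all images automatically. As shown in the links you posted, you can force HtmlUnit to download all images on a page with the following code: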

DomNodeList<DomElement> imageElements = htmlPage.getElementsByTagName("img");

for (DomElement imageElement : imageElements) {

    HtmlImage htmlImage = (HtmlImage) imageElement;

    try {

        // Download the image.
        htmlImage.getImageReader();
    }
    catch (IOException e) {
        // do nothing.
    }
}

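However, getting the current page is not trivial when using the Selenium HtmlUnitDriver. There are several ways to do it, but all of them require access to the protected HtmlUnitDriver.lastPage() method. One way to access this method is through reflection. Another solution is to exploit the fact that protected methods can also be accessed by classes in the same package, and that packages can be the same across jars. Using this latter feature/design flaw, I was able to come up with a solution that avoids reflection. Instead, it simply adds an ordinary class to the same package as HtmlUnitDriver, org.openqa.selenium.htmlunit: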

package org.openqa.selenium.htmlunit;

import java.io.IOException;

import com.gargoylesoftware.htmlunit.html.DomElement;
import com.gargoylesoftware.htmlunit.html.DomNodeList;
import com.gargoylesoftware.htmlunit.html.HtmlImage;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class HtmlUnitUtil {

    private HtmlUnitUtil() {
        throw new AssertionError();
    }

    public static void loadImages(HtmlUnitDriver htmlUnitDriver) {

        // Since we are in the same package (org.openqa.selenium.htmlunit)
        // as HtmlUnitDriver, we can access HtmlUnitDriver's protected
        // lastPage() method.
        HtmlPage htmlPage = (HtmlPage) htmlUnitDriver.lastPage();
        DomNodeList<DomElement> imageElements =
            htmlPage.getElementsByTagName("img");

        for (DomElement imageElement : imageElements) {

            HtmlImage htmlImage = (HtmlImage) imageElement;

            try {

                // Download the image.
                htmlImage.getImageReader();
            }
            catch (IOException e) {
                // do nothing.
            }
        }
    }
}

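Unfortunately, you need to call this code manually every time you want to load images. I created a feature request for HtmlUnitDriver (htmlunit-driver #40) to add an option to download images automatically. If you would like to see this feature, please vote for the issue.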
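
For comparison, the reflection route mentioned in this answer could look roughly like the sketch below. It is not taken from the answer: StubDriver is a hypothetical stand-in for HtmlUnitDriver (so the example runs without Selenium on the classpath), and LastPageReflection and lastPageOf are illustrative names. Reflection is only needed when the calling code lives outside the org.openqa.selenium.htmlunit package.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for HtmlUnitDriver so the sketch runs without
// Selenium on the classpath; only the protected lastPage() is mimicked.
class StubDriver {
    protected Object lastPage() {
        return "stub page";
    }
}

final class LastPageReflection {

    private LastPageReflection() {
        throw new AssertionError();
    }

    // Invokes the protected, no-argument lastPage() method on the driver.
    static Object lastPageOf(Object driver) {
        try {
            Method lastPage = findMethod(driver.getClass(), "lastPage");
            // Lift the Java access check for this Method object only.
            lastPage.setAccessible(true);
            return lastPage.invoke(driver);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not call lastPage()", e);
        }
    }

    // getDeclaredMethod only sees methods declared on that exact class,
    // so walk up the hierarchy in case the driver is a subclass.
    private static Method findMethod(Class<?> type, String name)
            throws NoSuchMethodException {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredMethod(name);
            } catch (NoSuchMethodException e) {
                // Not declared here; try the superclass.
            }
        }
        throw new NoSuchMethodException(name);
    }

    public static void main(String[] args) {
        System.out.println(lastPageOf(new StubDriver()));
    }
}
```

With a real driver, the returned object would be cast to HtmlPage and passed through the same image loop as in HtmlUnitUtil.loadImages. The same-package helper is less fragile, since a renamed method then fails at compile time rather than at run time.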

How can I make Selenium-driven HtmlUnit download images automatically?
