Question

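I have an application with two pages:
/login
/profile
I want to get a .har file for the /profile page.
When I go to the /login page, a cookie is created with key = connect.sid and value = "example value". This cookie is not yet active. I added a cookie with an active connect.sid.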

WebDriver webDriver = getDriver();
webDriver.get(LOGIN_PAGE);
webDriver.manage().addCookie(connectsSId);

This doesn't work, because /login creates a new cookie after the page has loaded. I also tried this code:

WebDriver webDriver = getDriver();
webDriver.get(PROFILE_PAGE);
webDriver.manage().deleteAllCookies();
webDriver.manage().addCookie(connectsSId);

This doesn't work either. The cookie is added, but it seems to be too late.

 WebDriver webDriver = getDriver();
 LoginPage loginPage = new LoginPage(getDriver());
 LandingPage landingPage = loginPage.login();
 landingPage.openProfilePage();

This code creates a .har file for the /login page.
For some reason, the file is only created after the first call to a page. I can't solve this problem.

Answer 1

You can use BrowserMob Proxy to capture all request and response data. See here

Answer 2

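Use PhantomJS together with BrowserMobProxy. PhantomJS lets us handle JavaScript-driven pages. The following code also works for HTTPS web addresses.

Put 'phantomjs.exe' in the root of the C drive, and you will get the 'HAR-Information.har' file on the C drive as well.

Make sure not to add a '/' at the end of the URL, e.g.

driver.get("https://www.google.co.in/");

should be

driver.get("https://www.google.co.in");

Otherwise, it will not work.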

package makemyhar;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import net.lightbody.bmp.BrowserMobProxy;
import net.lightbody.bmp.BrowserMobProxyServer;
import net.lightbody.bmp.core.har.Har;
import net.lightbody.bmp.proxy.CaptureType;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.phantomjs.PhantomJSDriver;
import org.openqa.selenium.phantomjs.PhantomJSDriverService;
import org.openqa.selenium.remote.CapabilityType;
import org.openqa.selenium.remote.DesiredCapabilities;

public class MakeMyHAR {
    public static void main(String[] args) throws IOException, InterruptedException {

        //BrowserMobProxy
        BrowserMobProxy server = new BrowserMobProxyServer();
        server.start(0);
        server.setHarCaptureTypes(CaptureType.getAllContentCaptureTypes());
        server.enableHarCaptureTypes(CaptureType.REQUEST_CONTENT, CaptureType.RESPONSE_CONTENT);
        server.newHar("Google");

        //PHANTOMJS_CLI_ARGS
        ArrayList<String> cliArgsCap = new ArrayList<>();
        cliArgsCap.add("--proxy=localhost:"+server.getPort());
        cliArgsCap.add("--ignore-ssl-errors=yes");

        //DesiredCapabilities
        DesiredCapabilities capabilities = new DesiredCapabilities();
        capabilities.setCapability(CapabilityType.ACCEPT_SSL_CERTS, true);
        capabilities.setCapability(CapabilityType.SUPPORTS_JAVASCRIPT, true);
        capabilities.setCapability(PhantomJSDriverService.PHANTOMJS_CLI_ARGS, cliArgsCap);
        capabilities.setCapability(PhantomJSDriverService.PHANTOMJS_EXECUTABLE_PATH_PROPERTY,"C:\\phantomjs.exe");

        //WebDriver
        WebDriver driver = new PhantomJSDriver(capabilities);
        driver.get("https://www.google.co.in");

        //HAR
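        Har har = server.getHar();
        // try-with-resources closes the stream even if writing fails
        try (FileOutputStream fos = new FileOutputStream("C:\\HAR-Information.har")) {
            har.writeTo(fos);
        }
        server.stop();
        // quit() ends the whole session, not just the current window
        driver.quit();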
    }
}
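
To check which pages a saved file actually covers (for example, whether it contains /profile requests or only /login ones), the HAR can be inspected as plain JSON. This is a minimal sketch in Python, assuming the usual HAR layout (`log.entries[].request.url`); the helper names `urls_for_path` and `load_har` and the sample data are illustrative and not part of the answer above.

```python
import json
from urllib.parse import urlparse


def urls_for_path(har, path_prefix):
    """Return request URLs in a HAR dict whose path starts with path_prefix."""
    urls = []
    for entry in har.get("log", {}).get("entries", []):
        url = entry["request"]["url"]
        if urlparse(url).path.startswith(path_prefix):
            urls.append(url)
    return urls


def load_har(filename):
    """Read a .har file (it is plain JSON) into a dict."""
    with open(filename, encoding="utf-8") as har_file:
        return json.load(har_file)


# Hypothetical sample standing in for a file written by har.writeTo(...)
sample_har = {
    "log": {
        "entries": [
            {"request": {"method": "GET", "url": "https://example.com/login"}},
            {"request": {"method": "GET", "url": "https://example.com/profile"}},
            {"request": {"method": "GET", "url": "https://example.com/profile/avatar.png"}},
        ]
    }
}

profile_urls = urls_for_path(sample_har, "/profile")
```

With a real file, `urls_for_path(load_har("HAR-Information.har"), "/profile")` would list the captured /profile requests; an empty list suggests the capture stopped before that page was loaded.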

Answer 3

Set these preferences in your Selenium code:

profile.setPreference("devtools.netmonitor.har.enableAutoExportToFile", true);
profile.setPreference("devtools.netmonitor.har.defaultLogDir", String.valueOf(dir));
profile.setPreference("devtools.netmonitor.har.defaultFileName", "network-log-file-%Y-%m-%d-%H-%M-%S");

Then open the console:

Actions keyAction = new Actions(driver);
keyAction.keyDown(Keys.LEFT_CONTROL).keyDown(Keys.LEFT_SHIFT).sendKeys("q").keyUp(Keys.LEFT_CONTROL).keyUp(Keys.LEFT_SHIFT).perform();
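
Once the export has run, the test still has to locate the file that was written. The following is a rough Python sketch, assuming the exported files land in the configured log directory, carry a `.har` extension, and use zero-padded timestamps as in the file name pattern above; the helper `latest_har` and the sample file names are hypothetical.

```python
import tempfile
from pathlib import Path


def latest_har(log_dir, prefix="network-log-file-"):
    """Return the newest exported HAR in log_dir, or None if there is none.

    Assumes the timestamp in the file name is zero-padded, so sorting the
    names alphabetically also sorts them chronologically.
    """
    candidates = sorted(Path(log_dir).glob(prefix + "*.har"))
    return candidates[-1] if candidates else None


# Demo with a throwaway directory and two empty, hypothetical export files
with tempfile.TemporaryDirectory() as tmp:
    for name in ("network-log-file-2019-01-02-10-00-00.har",
                 "network-log-file-2019-01-02-10-05-30.har"):
        (Path(tmp) / name).touch()
    newest_name = latest_har(tmp).name
    empty_result = latest_har(tmp, prefix="no-such-prefix-")
```

Sorting by name avoids relying on file modification times, which can be misleading if files are copied between machines.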

Answer 4

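I also tried to get the har file through a proxy such as BrowserMob Proxy.

I did a lot of research, because the file I received was always empty.

What I did instead was enable the browser performance log.

Note that this only works with the Chrome driver.

This is my driver class (in Python):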

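The output contains a huge amount of information, so you have to filter the raw data and keep only the network response-received and request-sent objects.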

from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium import webdriver
from lib.config import config


class Driver:

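    # Latest performance log; filled in by navigate()
    performance_log = None
    # Copy so the shared DesiredCapabilities.CHROME dict is not mutated
    capabilities = DesiredCapabilities.CHROME.copy()
    # Newer chromedriver versions may expect the key 'goog:loggingPrefs' instead
    capabilities['loggingPrefs'] = {'performance': 'ALL'}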

    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--disable-dev-shm-usage')
    chrome_options.add_argument("--headless")
    mobile_emulation = {"deviceName": "Nexus 5"}

    if config.Env().is_mobile():
        chrome_options.add_experimental_option(
            "mobileEmulation", mobile_emulation)
    else:
        pass

    chrome_options.add_experimental_option(
        'perfLoggingPrefs', {"enablePage": True})

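    def __init__(self):
        # Pass the capabilities as well; otherwise the performance log is never enabled
        self.instance = webdriver.Chrome(
            executable_path='/usr/local/bin/chromedriver',
            options=self.chrome_options,
            desired_capabilities=self.capabilities)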

    def navigate(self, url):
        if isinstance(url, str):
            self.instance.get(url)
            self.performance_log = self.instance.get_log('performance')
        else:
            raise TypeError("URL must be a string.")

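We chose to push this data into a MongoDB database, where it is later analyzed by an ETL job and pushed into a Redshift database to build statistics.

I hope this is what you are looking for.

This is how I run the script: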

import json
import secrets


def digest_log_data(performance_log):
    # write all raw data in a file
    with open('data.json', 'w', encoding='utf-8') as outfile:
        json.dump(performance_log, outfile)
    # open the file and read it with encoding='utf-8'
    with open('data.json', encoding='utf-8') as data_file:
        data = json.loads(data_file.read())
        return data


def digest_raw_data(data, mongo_object=None):
    # A mutable default argument would be shared between calls
    if mongo_object is None:
        mongo_object = {}
    wanted = ('Network.responseReceived', 'Network.requestWillBeSent')
    for entry in data:
        data_object = json.loads(entry['message'])
        # Keep only the network request/response events
        if data_object['message']['method'] in wanted:
            mongo_object[secrets.token_hex(30)] = data_object
    return mongo_object
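
The filtered events are still not a .har file. One possible way to get closer, sketched here under assumptions, is to pair each `Network.requestWillBeSent` event with its `Network.responseReceived` event by `requestId` and emit HAR-like entries. The function `events_to_har`, the creator name, and the sample events are illustrative; the result is a reduced subset of HAR (no headers, bodies, or detailed timings), and `time` only covers the span from request sent to response headers received.

```python
import json
from datetime import datetime, timezone


def events_to_har(events):
    """Pair request/response events by requestId into HAR-like entries."""
    requests, entries = {}, []
    for event in events:
        message = event["message"]
        params = message["params"]
        if message["method"] == "Network.requestWillBeSent":
            requests[params["requestId"]] = params
        elif message["method"] == "Network.responseReceived":
            sent = requests.get(params["requestId"])
            if sent is None:
                continue  # response without a recorded request; skip it
            started = datetime.fromtimestamp(sent["wallTime"], tz=timezone.utc)
            entries.append({
                "startedDateTime": started.isoformat(),
                # DevTools timestamps are in seconds; HAR uses milliseconds
                "time": round((params["timestamp"] - sent["timestamp"]) * 1000, 3),
                "request": {
                    "method": sent["request"]["method"],
                    "url": sent["request"]["url"],
                },
                "response": {
                    "status": params["response"]["status"],
                    "statusText": params["response"].get("statusText", ""),
                    "content": {"mimeType": params["response"].get("mimeType", "")},
                },
            })
    return {"log": {"version": "1.2",
                    "creator": {"name": "performance-log-sketch", "version": "0"},
                    "entries": entries}}


# Hypothetical events in the shape produced by json.loads(entry['message'])
sample_events = [
    {"message": {"method": "Network.requestWillBeSent",
                 "params": {"requestId": "1", "timestamp": 100.0,
                            "wallTime": 1546423200.0,
                            "request": {"method": "GET",
                                        "url": "https://example.com/profile"}}}},
    {"message": {"method": "Network.responseReceived",
                 "params": {"requestId": "1", "timestamp": 100.25,
                            "response": {"status": 200, "statusText": "OK",
                                         "mimeType": "text/html"}}}},
]

har = events_to_har(sample_events)
har_json = json.dumps(har, indent=2)  # text that could be written to a .har file
```

The values collected by `digest_raw_data` can be passed in directly, since each one has the same `message` / `params` shape as the sample events.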

My main source was the following, which I adapted to my own needs: https://www.reddit.com/r/Python/comments/97m9iq/headless_browsers_export_to_har/ Thanks.

Selenium: getting a .har file
