Connections initiated by utag are not visible in BrowserMob Proxy (Python Selenium)

Date: 2018-09-04 08:19:57

Tags: python selenium headless-browser browsermob-proxy

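I am using Python Selenium with headless Chrome and BrowserMob Proxy to print every connection that is opened while a page loads. I am particularly interested in connections initiated by tracking tags such as utag.js. The problem is that they do not show up when I run the script, although I do see them in the browser's developer console when I browse the page manually. I suspect I am missing something that triggers the JS, but I cannot figure out what.

Here is my script: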

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import os
import subprocess
from browsermobproxy import Server

proxy_options = {'port': 8888}
server= Server(path="/home/ubuntu/findanalytics/browsermob-proxy-2.1.4/bin/browsermob-proxy", options=proxy_options)
server.start()
proxy= server.create_proxy()

chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--window-size=1920,1080")
chrome_options.add_argument("--proxy-server={0}".format(proxy.proxy))

# next one would be possible via chrome options too but does not seem to work
# it is needed to make HTTPS visible in BMP
desired_capabilities = {"acceptInsecureCerts":True}


chrome_driver = os.getcwd()+"/chromedriver"

driver = webdriver.Chrome(chrome_options=chrome_options,executable_path=chrome_driver,desired_capabilities=desired_capabilities)
proxy.new_har("something")


driver.get("https://www.salt.ch")

for connection in proxy.har['log']['entries']:
    print(connection['request']['url'])


server.stop()
driver.quit()

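Can someone point me to a possible solution? I have searched for a while without finding anything; maybe I am just using the wrong search terms. (salt.ch is only used as an example here because it uses utag.js.)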
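To narrow the printed output to tag traffic, the HAR entries can be filtered by URL. The following is a minimal sketch, assuming the HAR dict has the same shape as `proxy.har` in the script above. The helper name `urls_matching` and the sample URLs are hypothetical and only serve as illustration.

```python
def urls_matching(har, needle):
    """Return request URLs from a HAR dict that contain `needle` (case-insensitive)."""
    needle = needle.lower()
    return [
        entry['request']['url']
        for entry in har['log']['entries']
        if needle in entry['request']['url'].lower()
    ]


# Hypothetical sample in the same shape as proxy.har; the URLs are made up.
sample_har = {
    'log': {
        'entries': [
            {'request': {'url': 'https://www.example.com/'}},
            {'request': {'url': 'https://tags.example.com/utag/main/prod/utag.js'}},
            {'request': {'url': 'https://www.example.com/style.css'}},
        ]
    }
}

for url in urls_matching(sample_har, 'utag'):
    print(url)
```

In the real script the call would presumably be `urls_matching(proxy.har, 'utag')`, which makes it easier to tell at a glance whether any tag request was captured.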

Update

If I search for something on the page by adding the following after driver.get:

# search something to play with JS
searchstring = "samsung"
searchfield = driver.find_element_by_name("q")
searchfield.send_keys(searchstring)

searchbtn = driver.find_element_by_id("field-search-submit")
searchbtn.click()

# end

Then I can see utag.js being loaded. However, I want to see it when the page loads. Any ideas?
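One possible explanation, offered as an assumption rather than a confirmed diagnosis: with the default page load strategy, driver.get returns once the page's load event has fired, while tag managers often inject their scripts asynchronously afterwards, so a HAR read immediately after driver.get may be incomplete. A sketch of waiting until the proxy stops seeing new requests is shown below. The helper `wait_until_har_quiet` and its parameters are hypothetical; it relies only on a callable that returns a HAR dict shaped like `proxy.har`.

```python
import time


def wait_until_har_quiet(get_har, quiet_seconds=2.0, timeout=30.0, poll=0.5):
    """Block until the number of HAR entries stays unchanged for `quiet_seconds`.

    Returns True if traffic went quiet, False if `timeout` elapsed first.
    """
    start = time.monotonic()
    deadline = start + timeout
    last_count = len(get_har()['log']['entries'])
    last_change = start
    while time.monotonic() < deadline:
        time.sleep(poll)
        count = len(get_har()['log']['entries'])
        now = time.monotonic()
        if count != last_count:
            # New requests arrived: restart the quiet-period clock.
            last_count, last_change = count, now
        elif now - last_change >= quiet_seconds:
            return True
    return False


# Usage with the script above (hypothetical, placed after driver.get(...)):
#     wait_until_har_quiet(lambda: proxy.har)
```

Passing a callable instead of the proxy object keeps the helper independent of BrowserMob Proxy, so it can be tried out with a fake HAR source before wiring it into the script.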

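"Solution"

OK, I have worked around it. It is ugly and I am still hoping for a real solution, but I can now see the connections. I am posting it here in case someone else runs into this problem, and I still hope that someone more skilled than me can offer a better solution. I removed the search part and added a 10-second sleep after driver.get (ugly). The complete, working script now looks like this: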

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import os
import subprocess
from browsermobproxy import Server
import time

PICPATH="/home/ubuntu/screenshots/"

proxy_options = {'port': 8888}
server= Server(path="/home/ubuntu/findanalytics/browsermob-proxy-2.1.4/bin/browsermob-proxy", options=proxy_options)
server.start()
proxy= server.create_proxy()

chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--window-size=1920,1080")
chrome_options.add_argument("--proxy-server={0}".format(proxy.proxy))

# next one would be possible via chrome options too but does not seem to work
# it is needed to make HTTPS visible in BMP
desired_capabilities = {"acceptInsecureCerts":True}


chrome_driver = os.getcwd()+"/chromedriver"

driver = webdriver.Chrome(chrome_options=chrome_options,executable_path=chrome_driver,desired_capabilities=desired_capabilities)
proxy.new_har("something")


driver.get("https://www.salt.ch")
time.sleep(10)

for connection in proxy.har['log']['entries']:
    print(connection['request']['url'])


server.stop()
driver.quit()
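The fixed 10-second sleep could be replaced by polling the HAR until the request of interest appears, which returns as soon as the tag has loaded and gives up only after a timeout. This is a minimal sketch and not a tested fix for salt.ch; the helper `wait_for_request` and its parameters are hypothetical, and it only assumes a callable that returns a HAR dict shaped like `proxy.har`.

```python
import time


def wait_for_request(get_har, substring, timeout=10.0, poll=0.25):
    """Poll the HAR until some request URL contains `substring`.

    Returns the first matching URL, or None if `timeout` elapses first.
    """
    deadline = time.monotonic() + timeout
    while True:
        for entry in get_har()['log']['entries']:
            url = entry['request']['url']
            if substring in url:
                return url
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


# Usage with the script above (hypothetical, in place of time.sleep(10)):
#     found = wait_for_request(lambda: proxy.har, 'utag.js', timeout=10.0)
#     if found is None:
#         print('utag.js was not requested within the timeout')
```

Compared with a fixed sleep, the worst case takes about as long, but the common case finishes as soon as the request is recorded.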

0 answers:

No answers