Selenium Python: "Browsing context has been discarded" with Firefox and GeckoDriver

Time: 2019-10-25 10:01:21

Tags: python-3.x selenium geckodriver

I am using Firefox and GeckoDriver with Selenium to download some files from a website on behalf of clients.

The setup runs in Docker on Digital Ocean.

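The flow is as follows: whenever a user calls the API, a new browser instance is created, logs in to the website with the user's login ID and password, downloads a set of files, creates a zip archive, and sends it back.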
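The archiving half of that flow can be sketched without Selenium. This is only an illustration under assumptions: `zip_directory` and `handle_request` are hypothetical names, and `download_files` stands in for the login and download step, which is not shown here. Each request gets its own temporary working folder so that concurrent requests never share a download directory.

```python
import os
import tempfile
import zipfile


def zip_directory(source_dir, zip_path):
    """Write every file under source_dir into a zip archive at zip_path."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                # Store paths relative to source_dir so no server paths leak out.
                archive.write(full_path, os.path.relpath(full_path, source_dir))
    return zip_path


def handle_request(download_files):
    """Hypothetical per-request flow: isolated folder, download, zip."""
    work_dir = tempfile.mkdtemp(prefix="request-")
    download_dir = os.path.join(work_dir, "downloads")
    os.makedirs(download_dir)
    # Placeholder for the Selenium step that logs in and saves files here.
    download_files(download_dir)
    # The archive lives next to the download folder, not inside it.
    return zip_directory(download_dir, os.path.join(work_dir, "result.zip"))
```

The caller would send the returned archive back to the user and then remove the working folder.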

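Everything seems to work when there is only one request, meaning a single browser instance on the server. When there are multiple requests, meaning multiple browser instances on the server, all of them break with the error message "Browsing context has been discarded".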

This happens either during the crawling step or right after the instance is created.

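The error follows no particular pattern; it occurs at random and breaks the browser instance. I have gone through all the questions and GitHub issues on this topic, but some are outdated workarounds that do not work with current versions, and others do not work at all.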

Here are the browser version and configuration I am running on Jenkins.

`
{'browserName': 'firefox',
 'marionette': True,
 'acceptInsecureCerts': True,
 'moz:firefoxOptions': {
     'prefs': {
         'browser.download.folderList': 2,
         'browser.download.dir': '/home/usr/usr/project/static/785fg7',
         'browser.download.useDownloadDir': True,
         'pdfjs.disabled': True,
         'browser.helperApps.neverAsk.saveToDisk': (
             'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet,'
             'application/pdf,'
             'application/csv,'
             'application/excel,'
             'application/vnd.msexcel,'
             'application/vnd.ms-excel,'
             'text/anytext,'
             'text/comma-separated-values,'
             'text/csv,'
             'application/vnd.ms-excel,'
             'application/octet-stream,'
             'image/tiff')},
     'args': ['-headless',
              '--no-sandbox',
              '--disable-setuid-sandbox',
              '--disable-dev-shm-usage',
              '--window-size=1920,1080',
              '--start-maximized']}}
`
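For comparison, here is a small sketch that builds a capabilities dictionary of the same shape in plain Python. It rests on assumptions: `build_firefox_capabilities` is a hypothetical helper, and it deliberately leaves out `--no-sandbox`, `--disable-setuid-sandbox`, and `--disable-dev-shm-usage`, which to my knowledge are Chromium command-line switches rather than Firefox ones. The window size is passed through Firefox's `-width` and `-height` options instead. Whether any of this affects the error is untested.

```python
def build_firefox_capabilities(download_dir, mime_types, width=1920, height=1080):
    """Return a capabilities dict shaped like the dump in the question."""
    # dict.fromkeys drops repeated MIME types while keeping their order
    # (insertion order is guaranteed from Python 3.7 on).
    never_ask = ",".join(dict.fromkeys(mime_types))
    return {
        "browserName": "firefox",
        "acceptInsecureCerts": True,
        "moz:firefoxOptions": {
            "prefs": {
                "browser.download.folderList": 2,  # 2 = use browser.download.dir
                "browser.download.dir": download_dir,
                "browser.download.useDownloadDir": True,
                "pdfjs.disabled": True,
                "browser.helperApps.neverAsk.saveToDisk": never_ask,
            },
            # Firefox command-line options only.
            "args": ["-headless", "-width", str(width), "-height", str(height)],
        },
    }
```

Building the dictionary in one place makes it easy to log exactly what each browser instance was started with and to compare the server configuration with the local one.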

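Note that this works fine locally no matter how many browser instances I create. The problem appears only when it is deployed on the server. On my local system, headless mode handles any number of requests without issues.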

Here is the Python code that launches the browser.

import logging
import os
import random

from selenium import webdriver
from selenium.webdriver.firefox.options import Options

# `constants` (DOWNLOADS_PATH, LOG_PATH) and `check_gecko_version` are project
# code that is not shown in the question.


def get_firefox_driver_for_linux_server(apply_proxy, uuid_user, download_options=False):
    firefox_options = Options()

    # Later Selenium 3 releases deprecate this in favor of
    # `firefox_options.headless = True`.
    firefox_options.set_headless()

    if download_options:
        # exist_ok avoids a race when two requests create the folder at once.
        os.makedirs(constants.DOWNLOADS_PATH, exist_ok=True)

        # Each user gets a separate download folder.
        download_path = os.path.join(constants.DOWNLOADS_PATH, uuid_user)
        firefox_options.set_preference("browser.download.folderList", 2)
        firefox_options.set_preference("browser.download.dir", download_path)
        firefox_options.set_preference("browser.download.useDownloadDir", True)
        firefox_options.set_preference("pdfjs.disabled", True)
        # MIME types that are saved to disk without a download prompt.
        firefox_options.set_preference("browser.helperApps.neverAsk.saveToDisk",
                                       "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet,"
                                       "application/pdf,"
                                       "application/csv,"
                                       "application/excel,"
                                       "application/vnd.msexcel,"
                                       "application/vnd.ms-excel,"
                                       "text/anytext,"
                                       "text/comma-separated-values,"
                                       "text/csv,"
                                       "application/vnd.ms-excel,"
                                       "application/octet-stream,"
                                       "image/tiff")

    firefox_options.add_argument("--no-sandbox")
    firefox_options.add_argument("--disable-setuid-sandbox")
    firefox_options.add_argument("--disable-dev-shm-usage")
    firefox_options.add_argument("--window-size=1920,1080")
    firefox_options.add_argument("--start-maximized")

    os.makedirs(constants.LOG_PATH, exist_ok=True)

    # Create an empty per-instance log file. The id is kept in a module-level
    # global, as in the original code.
    global random_id
    random_id = str(random.randint(1, 99999))
    logging.warning("random id...{}".format(random_id))
    with open(os.path.join(constants.LOG_PATH, random_id + '.log'), 'w+'):
        pass

    gecko_driver_path = "/usr/local/bin/geckodriver"
    if apply_proxy:
        proxy = "proxy:24000"
        # Copy the capabilities so the shared DesiredCapabilities.FIREFOX dict
        # is not modified for every later caller.
        firefox_capabilities = webdriver.DesiredCapabilities.FIREFOX.copy()
        firefox_capabilities['marionette'] = True
        firefox_capabilities['proxy'] = {
            "proxyType": "MANUAL",
            "httpProxy": proxy,
            "ftpProxy": proxy,
            "sslProxy": proxy
        }

        driver = webdriver.Firefox(executable_path=gecko_driver_path, firefox_options=firefox_options,
                                   capabilities=firefox_capabilities,
                                   log_path=os.path.join(constants.LOG_PATH, random_id + '.log'))
        check_gecko_version(driver, firefox_options)
        return driver
    else:
        logging.info("No proxy applied")
        # No log_path is passed here, so Selenium 3 writes to its default
        # geckodriver.log in the working directory.
        driver = webdriver.Firefox(executable_path=gecko_driver_path, firefox_options=firefox_options)
        check_gecko_version(driver, firefox_options)
        return driver
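One detail in the code above matters under concurrency: `random.randint(1, 99999)` can return the same id for two simultaneous requests, and the id is stored in a single global that every request overwrites. A minimal sketch of an alternative, assuming a hypothetical helper named `new_log_path`, derives the file name from a UUID and returns it instead of storing it globally.

```python
import os
import uuid


def new_log_path(log_dir):
    """Create log_dir if needed and return a practically unique log file path."""
    # exist_ok avoids a failure when two requests create the folder at once.
    os.makedirs(log_dir, exist_ok=True)
    # uuid4 makes collisions between concurrent requests extremely unlikely.
    return os.path.join(log_dir, "geckodriver-{}.log".format(uuid.uuid4().hex))
```

The returned path could be passed as `log_path` in both branches, so each browser instance writes its own GeckoDriver log and failures can be traced to a specific request.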

0 answers:

No answers yet.