Saving a pdf opened in the browser with Selenium

Time: 2018-09-06 20:23:32

Tags: python selenium internet-explorer-11

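So, I am logging in to a web application owned by my company and running a request that generates a pdf, all of it done from Python with the Internet Explorer driver. I can only use IE, because the company system does not work with any other browser.

After I submit the request, a new IE window pops up containing the pdf file I requested. I want to save that pdf file to my computer. I realize that downloading in IE is not easy, but there has to be a way to do it. I would also be fine with saving it as a png or any other format, but the pdf is long (usually 2-5 pages), so print screen or a screenshot will not work.

Any suggestions on what I can do?

Here is a simple code snippet: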

driver.implicitly_wait(5)

driver.find_element_by_name("invNumSrchTxt_H").send_keys("ABCDE")  # sending the parameters I need
driver.find_element_by_name("invDt_B").clear()  # clearing out some preset params
driver.find_element_by_name("invDt_A").clear()


# This is where I click the button; it pops open a new IE window with my pdf file in it.
s = driver.find_element_by_name("Print_Invoice")
s.click()

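1 answer:

Answer 0 (score: 0)

Since the IE driver does not support setting browser configuration (such as download preferences), you can fetch the file directly with requests instead.

A possible implementation might be: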

import os

import requests


def download_pdf_file(url, filename=None, download_dir=None):
    """
    Download the pdf file at url and
    save it in download_dir as filename.
    """
    if download_dir is None:  # set default download directory
        download_dir = r'C:\Users\{}\Downloads'.format(os.getlogin())

    if filename is None:  # pick the first unused name of the form pdf_<n>.pdf
        index = 1
        while os.path.isfile(os.path.join(download_dir, f'pdf_{index}.pdf')):
            index += 1
        filename = f'pdf_{index}.pdf'

    response = requests.get(url)  # get pdf data
    response.raise_for_status()  # fail loudly instead of saving an error page
    with open(os.path.join(download_dir, filename), 'wb') as pdf_file:
        pdf_file.write(response.content)  # save it in a new file


driver.implicitly_wait(5)

driver.find_element_by_name("invNumSrchTxt_H").send_keys("ABCDE")  #sending the parameters I need
driver.find_element_by_name("invDt_B").clear()  # Clearing out some preset params
driver.find_element_by_name("invDt_A").clear()


# This is where I click the button and this pops open a new IE window with my pdf file in it.
s=driver.find_element_by_name("Print_Invoice")
s.click()

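# The pdf opens in a new window, so switch to it before reading its url.
# Assumption: the newly opened window is the last handle in the list.
driver.switch_to.window(driver.window_handles[-1])

download_pdf_file(driver.current_url,  # pdf url of the new window
                  filename='myfile.pdf',  # custom filename
                  download_dir='')  # empty string: save in the current working directory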
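
One caveat with fetching the url through requests: the request is made outside the browser, so it does not carry the login session that IE has. If the web application requires authentication, the server may return a login or error page instead of the pdf. A minimal sketch of one way around this, assuming the session is cookie-based and that the relevant cookies are visible through driver.get_cookies() (worth verifying for IE), is to copy the browser cookies into the request and check that the response really is a pdf. The helper names below are hypothetical and not part of Selenium or requests:

```python
def cookies_for_requests(selenium_cookies):
    """Map the list of dicts returned by driver.get_cookies() to {name: value}."""
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


def looks_like_pdf(data):
    """PDF files start with the bytes %PDF-; an HTML login page will not."""
    return data.startswith(b"%PDF-")


def download_pdf_with_session(url, path, selenium_cookies):
    """Hypothetical helper: fetch url with the browser's cookies and save it to path."""
    import requests  # third-party; imported here so the helpers above need no dependencies

    response = requests.get(
        url,
        cookies=cookies_for_requests(selenium_cookies),
        timeout=30,
    )
    response.raise_for_status()
    if not looks_like_pdf(response.content):
        raise ValueError("response is not a pdf (possibly a login or error page)")
    with open(path, "wb") as pdf_file:
        pdf_file.write(response.content)
    return path
```

After switching to the pdf window, it could be called as download_pdf_with_session(driver.current_url, 'myfile.pdf', driver.get_cookies()). If the application authenticates some other way (for example Windows integrated authentication), cookies alone will not be enough.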