How can I download a PDF using Python or Selenium?

Asked: 2017-08-31 02:01:23

Tags: python selenium python-requests

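I am trying to download a PDF file. The file opens in the browser's PDF viewer and is only accessible while the session is active. This is what I have tried so far.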

import requests


def download_pdf(pdf_url):
    local_filename = "myfile.pdf"
    # URL of the login page
    login_url = 'https://esaj.tjsp.jus.br/sajcas/login?service=https%3A%2F%2Fesaj.tjsp.jus.br%2Fesaj%2Fj_spring_cas_security_check'
    s = requests.Session()
    s.auth = ('33945867894', '******')
    # The login request returns status code 200 (OK)
    s.post(login_url, verify=False)
    # Request the PDF URL (obtained with Selenium)
    # This fails with status code 401 (Unauthorized)
    r = s.get(pdf_url, stream=True, verify=False)
    with open(local_filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive chunks
                f.write(chunk)
    return local_filename

I cannot download the PDF because the request is rejected as unauthorized.

Edit: The code that finds the download link is on pastebin.

https://pastebin.com/kskmg4Bu
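
Since the PDF link is already found with Selenium, one direction that might be worth trying is to reuse the browser's authenticated cookies in the `requests` session instead of logging in a second time. The sketch below is only an illustration under that assumption: the helper names `copy_selenium_cookies` and `download_pdf_with_session` are hypothetical, `driver` is assumed to be a Selenium WebDriver that has already logged in, and whether this site accepts the copied cookies has not been verified.

```python
def copy_selenium_cookies(driver, session):
    """Copy every cookie held by a Selenium WebDriver into a requests.Session.

    `driver` must expose get_cookies() (as Selenium WebDriver does) and
    `session` must expose a cookie jar with set() (as requests.Session does).
    """
    for cookie in driver.get_cookies():
        session.cookies.set(
            cookie["name"],
            cookie["value"],
            domain=cookie.get("domain", ""),
            path=cookie.get("path", "/"),
        )
    return session


def download_pdf_with_session(session, pdf_url, local_filename="myfile.pdf"):
    """Stream pdf_url to local_filename using an already authenticated session."""
    response = session.get(pdf_url, stream=True)
    # Raise on 4xx/5xx so a 401 is reported instead of being saved as a "PDF".
    response.raise_for_status()
    with open(local_filename, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
    return local_filename


# Hypothetical usage, after logging in through the browser with Selenium:
#   import requests
#   session = copy_selenium_cookies(driver, requests.Session())
#   download_pdf_with_session(session, pdf_url)
```

If the server also ties the session to other request properties, such as the User-Agent header, copying cookies alone may not be enough.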

0 answers:

No answers yet