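I am trying to download a PDF file. The file opens in the browser's PDF viewer and is only accessible while the session is active. Here is what I am trying: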

    import requests

    def download_pdf(pdf_url):
        local_filename = "myfile.pdf"
        # URL to log in
        login_url = 'https://esaj.tjsp.jus.br/sajcas/login?service=https%3A%2F%2Fesaj.tjsp.jus.br%2Fesaj%2Fj_spring_cas_security_check'
        s = requests.Session()
        s.auth = ('33945867894', '******')
        # Logs in successfully, status code = 200 OK
        s.post(login_url, verify=False)
        # Access the PDF URL (obtained with Selenium)
        # Fails to get the PDF, status code = 401 (Unauthorized)
        r = s.get(pdf_url, stream=True, verify=False)
        with open(local_filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024):
                if chunk:  # filter out keep-alive new chunks
                    f.write(chunk)
                    # f.flush()
        return local_filename

Because the request is rejected as unauthorized, I cannot download the PDF.
Edit: the code that finds the download link is on pastebin.
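One idea I have not verified against this site: since the PDF URL comes from Selenium and the file is only served while the session is active, the browser's session cookies could perhaps be reused for the download instead of logging in a second time. Below is a minimal sketch of that idea. The helper name `selenium_cookies_to_jar` is hypothetical, and it assumes the cookies are given as the list of dicts that Selenium's `driver.get_cookies()` returns (keys such as `name`, `value`, `domain`, `path`, `secure`, and optionally `expiry`).

```python
from http.cookiejar import Cookie, CookieJar


def selenium_cookies_to_jar(selenium_cookies):
    """Build a standard-library CookieJar from Selenium-style cookie dicts.

    Hypothetical helper: `selenium_cookies` is assumed to be the list of
    dicts returned by `driver.get_cookies()`.
    """
    jar = CookieJar()
    for c in selenium_cookies:
        domain = c.get("domain", "")
        expiry = c.get("expiry")  # absent for session cookies
        jar.set_cookie(Cookie(
            version=0,
            name=c["name"],
            value=c["value"],
            port=None,
            port_specified=False,
            domain=domain,
            domain_specified=bool(domain),
            domain_initial_dot=domain.startswith("."),
            path=c.get("path", "/"),
            path_specified=True,
            secure=bool(c.get("secure", False)),
            expires=expiry,
            discard=expiry is None,  # session cookies are discarded on exit
            comment=None,
            comment_url=None,
            rest={},
        ))
    return jar
```

With a jar built this way, I would expect something like `s.cookies.update(selenium_cookies_to_jar(driver.get_cookies()))` on the `requests.Session` before calling `s.get(pdf_url, ...)` to carry the browser session over. Whether the server accepts those cookies from a different client is an assumption I have not tested.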