I have a list of PDF URLs and want to save the files locally on my computer

Asked: 2019-01-29 13:19:39

Tags: python pdf download urllib

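I'm new to Python. I have a particular JSON response, and I've extracted the documentUrl values from the dictionary into a list. How can I automatically download these PDFs into a folder?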

import urllib.request
import json

url = 'https://diavgeia.gov.gr/luminapi/api/search/export?q=decisionType:%22%CE%93%CE%9D%CE%A9%CE%9C%CE%9F%CE%94%CE%9F%CE%A4%CE%97%CE%A3%CE%97%22&OrganizationUid:%2250024%22&status:%22%CE%91%CE%BD%CE%B1%CF%81%CF%84%CE%B7%CE%BC%CE%AD%CE%BD%CE%B7%22&page=1&size=4&wt=json'


# Fetch the search results once and parse the JSON body
with urllib.request.urlopen(url) as u:
    data = json.loads(u.read().decode())
# Collect the document links in a list
pdf_links = list()
for key in data:
    for x in data[key]:
        pdf_links.append(x['documentUrl'])
# Print the collected links
print(pdf_links)
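
For reference, the link-extraction step could also be written as a small function. This is only a sketch: it assumes the results sit under the `decisionResultList` key (the key used in the answer on this page), the name `extract_pdf_links` is made up for illustration, and the sample data uses placeholder URLs rather than real API output.

```python
def extract_pdf_links(data, list_key="decisionResultList"):
    """Return the 'documentUrl' of every entry under list_key that has one."""
    return [
        doc["documentUrl"]
        for doc in data.get(list_key, [])
        if doc.get("documentUrl")
    ]


# Hypothetical sample shaped like the parsed JSON (placeholder values only)
sample = {
    "decisionResultList": [
        {"ada": "DOC-1", "documentUrl": "https://example.invalid/doc/DOC-1"},
        {"ada": "DOC-2", "documentUrl": "https://example.invalid/doc/DOC-2"},
        {"ada": "DOC-3"},  # entry without a link is skipped
    ]
}
links = extract_pdf_links(sample)
```

Reading one named key, instead of looping over every key in the dictionary, avoids errors if the response also carries fields that are not lists of documents.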

1 Answer:

Answer 0 (score: 0)

Here you go:

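import requests

# Query the search API; each entry in 'decisionResultList' describes one document
response = requests.get('https://diavgeia.gov.gr/luminapi/api/search/export?q=decisionType:%22%CE%93%CE%9D%CE%A9%CE%9C%CE%9F%CE%94%CE%9F%CE%A4%CE%97%CE%A3%CE%97%22&OrganizationUid:%2250024%22&status:%22%CE%91%CE%BD%CE%B1%CF%81%CF%84%CE%B7%CE%BC%CE%AD%CE%BD%CE%B7%22&page=1&size=4&wt=json')
response.raise_for_status()
for doc in response.json()['decisionResultList']:
    # Stream each PDF so it is written in chunks rather than held in memory
    r = requests.get(doc['documentUrl'], stream=True)
    r.raise_for_status()
    # Name the file after the document's 'ada' identifier
    with open('{}.pdf'.format(doc['ada']), 'wb') as f:
        for chunk in r.iter_content(chunk_size=8192):
            f.write(chunk)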

The following files were downloaded to my PC:

[Image: screenshot of the downloaded files]
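
If the files should go into a specific folder rather than the current working directory, something along these lines might work. It is a rough sketch that uses only the standard library (`urllib.request` and `shutil`) in place of `requests`. The names `pdf_path` and `download_pdfs` and the `opener` parameter are made up for illustration, and it assumes each entry carries the `ada` and `documentUrl` fields used in the answer.

```python
import shutil
import urllib.request
from pathlib import Path


def pdf_path(doc, out_dir):
    """Build the local path for one result entry, named after its 'ada' id."""
    return Path(out_dir) / "{}.pdf".format(doc["ada"])


def download_pdfs(docs, out_dir, opener=urllib.request.urlopen):
    """Save each entry's 'documentUrl' into out_dir and return the paths written.

    'opener' is a hypothetical hook: any callable that takes a URL and returns
    a readable, closable binary stream (urlopen by default).
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    saved = []
    for doc in docs:
        target = pdf_path(doc, out)
        # Copy the response body to disk in chunks instead of reading it all at once
        with opener(doc["documentUrl"]) as resp, open(target, "wb") as f:
            shutil.copyfileobj(resp, f)
        saved.append(target)
    return saved
```

Calling `download_pdfs(response.json()['decisionResultList'], 'pdfs')` would then write the files under `pdfs/`. The `opener` argument is only there so the function can be tried out without network access.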