Python: downloading PDFs into a .zip

Time: 2016-09-21 21:43:08

Tags: python pdf zip python-requests

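What I am trying to do is loop through a list of URLs to download a series of .pdf files and save them into a .zip. For now I am only testing the code with a single URL. The error I get is: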

Traceback (most recent call last):
  File "I:\test_pdf_download_zip.py", line 36, in <module>
    zip_file(zipfile_name, url)
  File "I:\test_pdf_download_zip.py", line 30, in zip_file
    myzip.write(dowload_pdf(url))
TypeError: expected a string or other character buffer object

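Does anyone know how to correctly pass the .pdf request to the .zip (avoiding the error above) so that I can append it, or whether this is possible at all?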

import os
import zipfile
import requests

output = r"I:"

# File name of the zipfile
zipfile_name = os.path.join(output, "test.zip")

# Random test pdf
url = r"http://www.pdf995.com/samples/pdf.pdf"

def create_zipfile(zipfile_name):
    zipfile.ZipFile(zipfile_name, "w")

def dowload_pdf(url):
    response = requests.get(url, stream=True)
    with open('test.pdf', 'wb') as f:
        f.write(response.content)

def zip_file(zip_name, url):
    with open(zip_name,'a') as myzip:
        myzip.write(dowload_pdf(url))

if __name__ == "__main__":
    create_zipfile(zipfile_name)
    zip_file(zipfile_name, url)
    print("Done")

1 answer:

Answer 0: (score: 0)

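Your dowload_pdf() function is saving the file, but it does not return anything. You need to modify it so that it actually returns the file path for myzip.write() to use. Also, rather than hardcoding test.pdf, pass a unique path to your download function so that you do not end up with multiple test.pdf entries in the archive.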
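
The advice above could be sketched roughly as follows. This is a minimal illustration, not the answerer's own code, and it assumes Python 3. The names download_pdf, zip_pdfs, fetch_with_requests, and the fetch and work_dir parameters are hypothetical and introduced only for this example. It also swaps the built-in open(zip_name, 'a') for zipfile.ZipFile: in the question, dowload_pdf() returns None and myzip is a plain file handle, so myzip.write(None) is what raises the TypeError.

```python
import os
import zipfile
from urllib.parse import urlsplit


def fetch_with_requests(url):
    """Return the body of `url` as bytes (needs the third-party requests package)."""
    import requests  # imported lazily so the zip logic below can be tried offline

    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail loudly on 404/500 instead of zipping an error page
    return response.content


def download_pdf(url, dest_dir, fetch=fetch_with_requests):
    """Save the PDF at `url` under `dest_dir` and return the path of the saved file."""
    # Name the file after the last URL path segment instead of hardcoding test.pdf.
    name = os.path.basename(urlsplit(url).path) or "download.pdf"
    path = os.path.join(dest_dir, name)
    with open(path, "wb") as f:
        f.write(fetch(url))
    return path  # the caller needs this path to add the file to the archive


def zip_pdfs(zip_name, urls, work_dir, fetch=fetch_with_requests):
    """Download every URL and add the resulting files to the archive `zip_name`."""
    # zipfile.ZipFile (not the built-in open) is what understands the zip format.
    with zipfile.ZipFile(zip_name, "w") as myzip:
        for url in urls:
            path = download_pdf(url, work_dir, fetch)
            # arcname stores only the file name in the archive, not the full local path.
            myzip.write(path, arcname=os.path.basename(path))
```

With these hypothetical helpers, the question's test case would be a call such as zip_pdfs(zipfile_name, [url], output). Two URLs that end in the same file name would still collide, so a real script would probably need a more careful naming scheme. If the intermediate files on disk are not needed, ZipFile.writestr(name, data) can write the downloaded bytes into the archive directly.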