Downloading a CSV file with Python 3

Time: 2015-10-19 00:41:56

Tags: python csv web

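I am new to Python. Here is my environment setup:

I have Anaconda 3 (Python 3). I would like to be able to download a CSV file from a website: https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD

I want to use the requests library. I would really appreciate help understanding how to use the requests library to download the CSV file to a local directory on my machine.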

1 answer:

Answer 0: (score: 1)

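It is advisable to download the data as a stream and write it chunk by chunk into the target file or an intermediate local file, so the whole response never has to sit in memory at once.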

import requests


def download_file(url, output_file, compressed=True):
    """
    Stream the content at `url` into `output_file` and return the file path.

    compressed: ask the server for a gzip-compressed response
    """
    headers = {}
    if compressed:
        headers["Accept-Encoding"] = "gzip"

    # NOTE the stream=True parameter: the body is fetched in chunks as it is
    # iterated, instead of being loaded into memory all at once.
    r = requests.get(url, headers=headers, stream=True)
    r.raise_for_status()  # fail on 4xx/5xx instead of saving an error page

    with open(output_file, 'wb') as f:  # open for binary writing
        for chunk in r.iter_content(chunk_size=4096):
            if chunk:  # filter out keep-alive new chunks
                f.write(chunk)
        f.flush()  # optional: the with block flushes on close anyway

    return output_file

Applied to the original post:

remote_csv = "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"
local_output_file = "test.csv"

download_file(remote_csv, local_output_file)

# Check the file content, just for test purposes:
with open(local_output_file) as f:
    print(f.read())
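
Printing the whole file is fine for a quick test, but for a large CSV it may be more practical to look at only the first few rows. The following is a minimal sketch using the standard-library csv module; the helper name `preview_csv`, the `max_rows` default, and the UTF-8 encoding are assumptions for illustration and are not part of the original answer.

```python
import csv
from itertools import islice


def preview_csv(path, max_rows=5, encoding="utf-8"):
    """Return (header, rows): the first row and up to max_rows rows after it.

    Hypothetical helper; the encoding is an assumption and may need to be
    adjusted for the actual file.
    """
    # newline="" is what the csv module documentation recommends for files
    # passed to csv.reader, so quoted fields with line breaks parse correctly.
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        header = next(reader, None)  # None when the file is empty
        rows = list(islice(reader, max_rows))  # stop reading after max_rows
    return header, rows
```

After the download, `header, rows = preview_csv(local_output_file)` would give the column names and a handful of records without reading the rest of the file.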

The basic code was taken from this post: https://stackoverflow.com/a/16696317/176765

More details on how the requests library handles streamed response bodies can be found here:

http://docs.python-requests.org/en/latest/user/advanced/#body-content-workflow
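
One point from that page is that, with stream=True, the connection is not released back to the pool until all the data has been consumed or the response is closed. A possible variant is sketched below; it assumes a requests version in which the response object can be used in a with statement. The function name `download_with_context` and the 30-second timeout are illustrative choices, not part of the original answer.

```python
import requests


def download_with_context(url, output_file, chunk_size=4096, timeout=30):
    """Stream `url` into `output_file`, closing the response when done.

    Hypothetical variant of download_file; the timeout value is an assumption.
    """
    # The with statement closes the response (and releases the connection)
    # even if writing the file raises an exception.
    with requests.get(url, stream=True, timeout=timeout) as r:
        r.raise_for_status()  # raise on 4xx/5xx responses
        with open(output_file, "wb") as f:
            for chunk in r.iter_content(chunk_size=chunk_size):
                f.write(chunk)
    return output_file
```

With the URL from the question, the call would be `download_with_context(remote_csv, "test.csv")`.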