How can I download large files in Python without a MemoryError?

Date: 2014-11-21 02:07:10

Tags: python out-of-memory

I want to download some files programmatically, but I get a MemoryError for larger files. For example, when I download a small file the code works fine, but when I try to download a larger file I catch a MemoryError.

Here is my code:

import gzip
import urllib2
from StringIO import StringIO


def __download_gpl_file(accession):
    try:
        bin_string = __get_response(accession)
        if bin_string is None:
            return False
        string = __unzip(bin_string)
    except MemoryError:
        print 'Out of memory for: ' + accession
        return False

    if string:
        # DOWNLOADED is a directory prefix defined elsewhere.
        filename = DOWNLOADED + accession + '.txt'
        with open(filename, 'w+') as f:
            f.write(string)
        return True
    return False


def __get_response(accession, attempts=5):
    url = __construct_gpl_url(accession)  # Not shown
    response = None
    while attempts > 0:
        try:
            response = urllib2.urlopen(url)
            if response and response.getcode() < 201:
                break
            else:
                attempts -= 1
        except urllib2.URLError:
            print 'URLError with: ' + url
            attempts -= 1
    if response is None:
        return None
    return response.read()


def __unzip(bin_string):
    # Decompress the entire gzip payload in memory and return it as one string.
    f = StringIO(bin_string)
    decompressed = gzip.GzipFile(fileobj=f)
    return decompressed.read()

Is there any way I can download these larger files? Thanks in advance.

2 answers:

Answer 0 (score: 4):

Instead of writing the whole file at once, write it line by line:

file = urllib2.urlopen('url')
with open('filename','w') as f:
    for x in file:
        f.write(x)

If you want to make it faster:

file = urllib2.urlopen('url')
with open('filename','w') as f:
    while True:
        tmp = file.read(1024)
        if not tmp:
            break 
        f.write(tmp)
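
As a side note, the same chunked copy can also be expressed with the standard library's shutil.copyfileobj. A minimal sketch, assuming Python 2 / urllib2 as in the question, with 'url' and 'filename' as placeholders:

import shutil
import urllib2

response = urllib2.urlopen('url')
with open('filename', 'wb') as f:
    # copyfileobj reads and writes in fixed-size chunks (64 KB here),
    # so the whole response never has to fit in memory.
    shutil.copyfileobj(response, f, 1024 * 64)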

Answer 1 (score: 2):

I don't have enough reputation to comment on Hackaholic's answer, so my answer is just his first example with a slight correction.

file = urllib2.urlopen('url')
with open('filename', 'w') as f:
    for x in file:
        f.write(x)

I think he accidentally wrote f.write(f).
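
Since the payload in the question is gzip-compressed, the same streaming idea can be applied to the decompression step as well. A minimal sketch, assuming Python 2.7 with urllib2, gzip and shutil from the standard library; the URL and file names are placeholders:

import gzip
import shutil
import urllib2

# 1. Stream the compressed response to disk in chunks.
response = urllib2.urlopen('url')
with open('accession.txt.gz', 'wb') as gz_out:
    shutil.copyfileobj(response, gz_out, 1024 * 64)

# 2. Decompress from disk to disk, again in chunks, so neither the
#    compressed nor the decompressed data is held in memory at once.
with gzip.open('accession.txt.gz', 'rb') as gz_in:
    with open('accession.txt', 'w') as out:
        shutil.copyfileobj(gz_in, out, 1024 * 64)

Memory use is then bounded by the chunk size rather than by the size of the downloaded or decompressed file.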