How do I handle a MemoryError in Python?

Time: 2014-02-11 12:08:25

Tags: python file python-2.7

Here is my code, which counts word frequencies:

import collections
import codecs
import io
from collections import Counter
with io.open('Combine.txt', 'r', encoding='utf8') as infh:
    words =infh.read().split()
    with open('Counts2.txt', 'wb') as f:
        for word, count in Counter(words).most_common(100000000):
            f.write(u'{} {}\n'.format(word, count).encode('utf-8')) 

When I try to read a large file (4 GB), I get this error:

Traceback (most recent call last):
  File "counter.py", line 7, in <module>
    words =infh.read().split()
  File "/usr/lib/python2.7/codecs.py", line 296, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
MemoryError

I am using Ubuntu 12.04 with 8 GB of RAM and an Intel Core i7. How can I resolve this error?

2 answers:

Answer 0: (score: 1)

This is the Pythonic way to process a file line by line:

with open(...) as fh:
    for line in fh:
        pass

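The with statement takes care of opening and closing the file, even if an exception is raised inside the block. Iterating over the file object fh uses buffered I/O and yields one line at a time, so the whole file is never loaded into memory at once and large files are not a problem.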
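Applied to the word-count problem in the question, the line-by-line pattern might look like the sketch below. The function name count_words and its top_n parameter are made up for illustration and are not part of the original code. The Counter is updated one line at a time, so memory use grows with the number of distinct words rather than with the size of the file.

```python
import io
from collections import Counter


def count_words(in_path, out_path, top_n=None):
    """Count whitespace-separated words, reading the input one line at a time.

    count_words and top_n are hypothetical names used only for this sketch.
    """
    counts = Counter()
    with io.open(in_path, 'r', encoding='utf8') as infh:
        for line in infh:
            # Only the current line is held in memory, never the whole file.
            counts.update(line.split())

    with io.open(out_path, 'w', encoding='utf8') as outfh:
        # most_common(None) returns every word, most frequent first.
        for word, count in counts.most_common(top_n):
            outfh.write(u'{} {}\n'.format(word, count))
    return counts
```

For the files in the question this would be called as count_words('Combine.txt', 'Counts2.txt'). If the file contains so many distinct words that the Counter itself does not fit in memory, this approach will still fail and a different strategy would be needed.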

Answer 1: (score: -1)

How about using readline instead of read()?

http://docs.python.org/2/tutorial/inputoutput.html
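A minimal sketch of what the readline suggestion could look like, assuming the same UTF-8 input as in the question. The function name count_words_readline is hypothetical. readline returns an empty string at end of file, which is what ends the loop.

```python
import io
from collections import Counter


def count_words_readline(path):
    """Count words by calling readline() until it returns an empty string.

    count_words_readline is a hypothetical name used only for this sketch.
    """
    counts = Counter()
    with io.open(path, 'r', encoding='utf8') as fh:
        while True:
            line = fh.readline()
            if not line:  # an empty string signals end of file
                break
            counts.update(line.split())
    return counts
```

This reads the same data as iterating over the file object directly. The explicit loop is just a more verbose way to write it.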