Memory error in Python on a Mac

Time: 2014-01-18 10:57:12

Tags: python dictionary

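I have about 10,000 files, each containing a large amount of data.

I am trying to build a Python dict that covers all of the files and some of the data in each file.

This is what I am doing: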

import os

results = {}
for bfile in os.listdir(files_dir):
    fname, ext = os.path.splitext(bfile)

    fhandle = open(os.path.join(files_dir, bfile), 'r')
    if fname not in results:
        results[fname] = {}
    for line in fhandle:
        line = line.split("\t")

        if line[0] not in results[fname]:
            results[fname][line[0]] = {}

        if line[1] not in results[fname][line[0]]:
            results[fname][line[0]][line[1]] = {}

This should be a trivial task, but I get this error:

  File "script.py", line 409, in <module>
    file_handle()
  File "script.py", line 247, in file_handle
    results[fname][line[0]][line[1]] = {}
MemoryError
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/apport_python_hook.py", line 66, in apport_excepthook
    from apport.fileutils import likely_packaged, get_recent_crashes
  File "/usr/lib/python2.7/dist-packages/apport/__init__.py", line 1, in <module>
    from apport.report import Report
  File "/usr/lib/python2.7/dist-packages/apport/report.py", line 18, in <module>
    import problem_report
  File "/usr/lib/python2.7/dist-packages/problem_report.py", line 14, in <module>
    import zlib, base64, time, sys, gzip, struct, os
  File "/usr/lib/python2.7/gzip.py", line 10, in <module>
    import io
  File "/usr/lib/python2.7/io.py", line 60, in <module>
    import _io
MemoryError

Original exception was:
Traceback (most recent call last):
  File "script.py", line 409, in <module>
    file_handle()
  File "script.py", line 247, in file_handle
    results[fname][line[0]][line[1]] = {}
MemoryError
Segmentation fault (core dumped)

1 answer:

Answer 0: (score: 0)

It looks like you never close the files once you are done with them. That may be the problem; try the following:

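import os

results = {}
# files_dir is the directory from the question.
for bfile in os.listdir(files_dir):
    fname, ext = os.path.splitext(bfile)

    # The with statement closes the file when the block exits.
    with open(os.path.join(files_dir, bfile), 'r') as fhandle:
        if fname not in results:
            results[fname] = {}
        for line in fhandle:
            line = line.split("\t")

            if line[0] not in results[fname]:
                results[fname][line[0]] = {}

            if line[1] not in results[fname][line[0]]:
                results[fname][line[0]][line[1]] = {}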
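
For anyone who wants to try the suggestion on a small sample directory, here is a minimal sketch of the same loop wrapped in a function. The name build_index is made up for illustration, and two details are my own assumptions rather than part of the question: the trailing newline is stripped before splitting, and lines with fewer than two tab-separated columns are skipped.

```python
import os


def build_index(files_dir):
    """Map file stem -> first column -> second column -> {} (hypothetical helper)."""
    results = {}
    for bfile in sorted(os.listdir(files_dir)):
        path = os.path.join(files_dir, bfile)
        if not os.path.isfile(path):
            continue  # assumption: ignore subdirectories
        fname, _ext = os.path.splitext(bfile)
        per_file = results.setdefault(fname, {})
        # The with block closes the file as soon as the loop finishes.
        with open(path, 'r') as fhandle:
            for line in fhandle:
                # Assumption: drop the newline so it does not end up in a key.
                fields = line.rstrip("\n").split("\t")
                if len(fields) < 2:
                    continue  # assumption: skip lines without two columns
                per_file.setdefault(fields[0], {}).setdefault(fields[1], {})
    return results
```

Note that this keeps the same nested structure, so the dictionary still grows with the number of distinct keys. Closing the files releases the file handles, but it does not by itself shrink the dict.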