MemoryError in Python when trying to load a large CSV file

Time: 2014-03-06 17:10:52

Tags: python r csv memory

When I load a large CSV file with pandas, I get the following MemoryError:

Traceback (most recent call last):
  File "/home/k/workspace/loans/src/loans.py", line 100, in <module>
    X_test  =   testdata('test_v2.csv')
  File "/home/k/workspace/loans/src/loans.py", line 18, in testdata
    X   =   pd.read_table(filename, sep=',',    warn_bad_lines=True,    error_bad_lines=True)
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 420, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 225, in _read
    return parser.read()
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 626, in read
    ret = self._engine.read(nrows)
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 1070, in read
    data = self._reader.read(nrows)
  File "parser.pyx", line 727, in pandas.parser.TextReader.read (pandas/parser.c:6866)
  File "parser.pyx", line 777, in pandas.parser.TextReader._read_low_memory (pandas/parser.c:7452)
  File "parser.pyx", line 1788, in pandas.parser._concatenate_chunks (pandas/parser.c:20462)
MemoryError

The file is 1 GB. R opens it without much trouble (which is odd, because if I understand correctly, R is a higher-level language than Python...)

I am running the code on an Intel(R) Core(TM) i3 CPU 550 @ 3.20GHz with 4 GB of RAM, under 32-bit Ubuntu 12.04 Linux.
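
The 32-bit detail may matter as much as the amount of RAM: a 32-bit process can usually address only about 2 to 3 GB of virtual memory, however much RAM is installed, and the traceback ends in _concatenate_chunks, a step that plausibly needs room for the parsed chunks and their combined copy at the same time. As a minimal sketch using only the standard library (the helper name interpreter_bits is hypothetical), this is one way to confirm whether the Python build itself is 32-bit:

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer width of the running Python build, in bits."""
    # struct.calcsize("P") is the size of a C pointer (void *) in bytes.
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("Python build: %d-bit" % interpreter_bits())
    # sys.maxsize is 2**31 - 1 on 32-bit builds and 2**63 - 1 on 64-bit builds.
    print("sys.maxsize: %d" % sys.maxsize)
```

If this reports 32, the process is likely to hit its address-space ceiling well before the machine runs out of physical memory.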

Are there any tricks to make this work?

Thanks!
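
One approach that might help, offered only as a sketch and not as a confirmed fix for this setup: read the file in chunks with the chunksize parameter of pd.read_csv and shrink each chunk before keeping it. The function name read_csv_downcast and the default chunk size are hypothetical, and the sketch assumes float32 precision is acceptable for the data. The final pd.concat still needs memory for the pieces and the result at once, so this may or may not fit in a 32-bit process.

```python
import pandas as pd


def read_csv_downcast(filename, chunksize=100000):
    """Read a CSV in chunks, storing float64 columns as float32.

    Assumption: float32 precision is good enough for the data.
    """
    pieces = []
    # With chunksize set, read_csv yields DataFrames of at most
    # `chunksize` rows instead of parsing the whole file at once.
    for chunk in pd.read_csv(filename, sep=",", chunksize=chunksize):
        float_cols = chunk.select_dtypes(include="float64").columns
        # float32 takes half the space of float64 for these columns.
        chunk = chunk.astype({col: "float32" for col in float_cols})
        pieces.append(chunk)
    # Combine the reduced chunks into a single DataFrame.
    return pd.concat(pieces, ignore_index=True)
```

With the file from the traceback, the call would look like X = read_csv_downcast('test_v2.csv'). If the whole table is not needed at once, processing each chunk inside the loop and keeping only the result avoids the final concatenation altogether.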

0 answers:

No answers