Question

When I load a large CSV file with pandas, I get the following MemoryError:

Traceback (most recent call last):
  File "/home/k/workspace/loans/src/loans.py", line 100, in <module>
    X_test  =   testdata('test_v2.csv')
  File "/home/k/workspace/loans/src/loans.py", line 18, in testdata
    X   =   pd.read_table(filename, sep=',',    warn_bad_lines=True,    error_bad_lines=True)
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 420, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 225, in _read
    return parser.read()
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 626, in read
    ret = self._engine.read(nrows)
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 1070, in read
    data = self._reader.read(nrows)
  File "parser.pyx", line 727, in pandas.parser.TextReader.read (pandas/parser.c:6866)
  File "parser.pyx", line 777, in pandas.parser.TextReader._read_low_memory (pandas/parser.c:7452)
  File "parser.pyx", line 1788, in pandas.parser._concatenate_chunks (pandas/parser.c:20462)
MemoryError

The file is 1 GB. R opens it without much trouble (which is odd because, if I understand correctly, R is a higher-level language than Python...).

I'm running the code on 32-bit Ubuntu Linux 12.04, on an Intel(R) Core(TM) i3 CPU 550 @ 3.20GHz with 4 GB of RAM.

Are there any tricks to make this work?

Thanks!

MemoryError in Python when trying to load a large CSV file

0 answers: