I have a ~400MB JSON file that I want to convert into a dataset of columns and rows. I am using the following code to open the file in a Jupyter Notebook, but I get a MemoryError:
with open(r'file_path', encoding="utf8") as f:
    data = json.load(f)
    df = pd.io.json.json_normalize(data['rows'])
The error:
MemoryError Traceback (most recent call last)
<ipython-input-2-79552ba3688b> in <module>()
1 with open(r'file_path', encoding="utf8") as f:
--> 2 data = json.load(f)
3 df = pd.io.json.json_normalize(data['rows'])
C:\Users\...\lib\json\__init__.py in load(fp, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
294
295 """
--> 296 return loads(fp.read(),
297 cls=cls, object_hook=object_hook,
298 parse_float=parse_float, parse_int=parse_int,
C:\Users\...\lib\codecs.py in decode(self, input, final)
319 # decode input (taking the buffer into account)
320 data = self.buffer + input
--> 321 (result, consumed) = self._buffer_decode(data, self.errors, final)
322 # keep undecoded input until the next call
323 self.buffer = data[consumed:]
MemoryError:
With a 300KB file, this code runs fine.
I have tried both 32-bit and 64-bit Python, and my [Windows] machine has 8GB of RAM.
Any ideas on how to open this file into a dataset?
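One direction I have been considering (not working code from my notebook, just a sketch): instead of `json.load`, which materializes the whole 400MB document at once, stream the items of the large array incrementally. The third-party `ijson` library does this robustly; the standard library can approximate it with `json.JSONDecoder.raw_decode`, as below. This sketch assumes the first `[` in the file opens the big array of rows (i.e. no earlier string or value contains `[`), which holds for my file but is not general:

```python
import io
import json

def iter_json_array(fp, buf_size=65536):
    """Yield items of the first JSON array in *fp* one at a time,
    reading the file in chunks instead of all at once.
    Sketch only: assumes the first '[' opens the target array."""
    decoder = json.JSONDecoder()
    buf = ""
    # Read until we find the opening '[' of the array.
    while True:
        chunk = fp.read(buf_size)
        if not chunk:
            return
        buf += chunk
        start = buf.find("[")
        if start != -1:
            buf = buf[start + 1:]
            break
    # Decode one array element at a time.
    while True:
        buf = buf.lstrip()
        while buf[:1] == ",":           # skip separators between elements
            buf = buf[1:].lstrip()
        if buf[:1] == "]":              # end of the array
            return
        try:
            obj, end = decoder.raw_decode(buf)
        except json.JSONDecodeError:
            # Element is incomplete in the buffer: read more and retry.
            chunk = fp.read(buf_size)
            if not chunk:
                return
            buf += chunk
            continue
        yield obj
        buf = buf[end:]
```

With something like this, the rows could be consumed in batches (e.g. appended to a list of DataFrames a few thousand at a time and concatenated at the end) so peak memory stays bounded by a batch rather than the whole file.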
Thanks, NAZ