I have a ~400MB JSON file that I want to convert into a dataset of columns and rows. I am using the following code to open the file in a Jupyter Notebook, but I get a MemoryError:
with open(r'file_path', encoding="utf8") as f:
    data = json.load(f)
    df = pd.io.json.json_normalize(data['rows'])
The error:
MemoryError Traceback (most recent call last)
<ipython-input-2-79552ba3688b> in <module>()
1 with open(r'file_path', encoding="utf8") as f:
--> 2 data = json.load(f)
3 df = pd.io.json.json_normalize(data['rows'])
C:\Users\...\lib\json\__init__.py in load(fp, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
294
295 """
--> 296 return loads(fp.read(),
297 cls=cls, object_hook=object_hook,
298 parse_float=parse_float, parse_int=parse_int,
C:\Users\...\lib\codecs.py in decode(self, input, final)
319 # decode input (taking the buffer into account)
320 data = self.buffer + input
--> 321 (result, consumed) = self._buffer_decode(data, self.errors, final)
322 # keep undecoded input until the next call
323 self.buffer = data[consumed:]
MemoryError:
With a 300KB file, this code runs fine.
I have tried both 32-bit and 64-bit Python, and my [Windows] machine has 8GB of RAM.
Any ideas on how to open this file into a dataset?
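One direction I have been considering (not working code from my notebook, just a sketch): instead of `json.load`, which materializes the whole 400MB document at once, stream the items of the large array incrementally. The third-party `ijson` library does this robustly; the standard library can approximate it with `json.JSONDecoder.raw_decode`, as below. This sketch assumes the first `[` in the file opens the big array of rows (i.e. no earlier string or value contains `[`), which holds for my file but is not general:

```python
import io
import json

def iter_json_array(fp, buf_size=65536):
    """Yield items of the first JSON array in *fp* one at a time,
    reading the file in chunks instead of all at once.
    Sketch only: assumes the first '[' opens the target array."""
    decoder = json.JSONDecoder()
    buf = ""
    # Read until we find the opening '[' of the array.
    while True:
        chunk = fp.read(buf_size)
        if not chunk:
            return
        buf += chunk
        start = buf.find("[")
        if start != -1:
            buf = buf[start + 1:]
            break
    # Decode one array element at a time.
    while True:
        buf = buf.lstrip()
        while buf[:1] == ",":           # skip separators between elements
            buf = buf[1:].lstrip()
        if buf[:1] == "]":              # end of the array
            return
        try:
            obj, end = decoder.raw_decode(buf)
        except json.JSONDecodeError:
            # Element is incomplete in the buffer: read more and retry.
            chunk = fp.read(buf_size)
            if not chunk:
                return
            buf += chunk
            continue
        yield obj
        buf = buf[end:]
```

With something like this, the rows could be consumed in batches (e.g. appended to a list of DataFrames a few thousand at a time and concatenated at the end) so peak memory stays bounded by a batch rather than the whole file.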
Thanks, NAZ