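I have a computer with 32 GB of RAM, and the CSV file is 1 million rows by 4 columns (800 MB). When I run the code, Python uses only about 1 GB of memory, yet I get a memory error: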
MemoryError: Unable to allocate array with shape (23459822,) and data type int64
Note: the problem occurs only on Windows; on Ubuntu, with exactly the same code, it goes away.
Relevant code:
    elif light in entry:
        df = pandas.read_csv('maps_android_light_raw_20190909.csv')
        for i, g in df.groupby('device_id'):
            output_file2 = path + f'{i}/LIGHT/'
            if not os.path.exists(output_file2):
                os.makedirs(output_file2)
            g.to_csv(output_file2 + f'{i}.csv', index=False)
        del df
Full traceback:
Traceback (most recent call last):
  File "light.py", line 49, in <module>
    main()
  File "light.py", line 33, in main
    for i,g in df2:
  File "C:\Python37\lib\site-packages\pandas\core\groupby\ops.py", line 164, in get_iterator
    for key, (i, group) in zip(keys, splitter):
  File "C:\Python37\lib\site-packages\pandas\core\groupby\ops.py", line 899, in __iter__
    sdata = self._get_sorted_data()
  File "C:\Python37\lib\site-packages\pandas\core\groupby\ops.py", line 918, in _get_sorted_data
    return self.data.take(self.sort_idx, axis=self.axis)
  File "pandas/_libs/properties.pyx", line 34, in pandas._libs.properties.CachedProperty.__get__
  File "C:\Python37\lib\site-packages\pandas\core\groupby\ops.py", line 896, in sort_idx
    return get_group_index_sorter(self.labels, self.ngroups)
  File "C:\Python37\lib\site-packages\pandas\core\sorting.py", line 349, in get_group_index_sorter
    sorter, _ = algos.groupsort_indexer(ensure_int64(group_index), ngroups)
  File "pandas/_libs/algos.pyx", line 173, in pandas._libs.algos.groupsort_indexer
MemoryError: Unable to allocate array with shape (23459822,) and data type int64
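
For reference, the array named in the error is fairly small: 23459822 int64 values at 8 bytes each is roughly 179 MiB. A failure at that size while the process holds only about 1 GB could be consistent with a 32-bit Python build on the Windows machine, though that is only an assumption and nothing above confirms it. A minimal sketch to check the interpreter's bitness and the size of the requested allocation, using only the standard library (the variable names are illustrative):

    ```python
    import struct
    import sys

    # Pointer size in bits: 32 on a 32-bit interpreter, 64 on a 64-bit one.
    pointer_bits = struct.calcsize("P") * 8

    # Second, independent check: sys.maxsize exceeds 2**32 only on 64-bit builds.
    is_64bit = sys.maxsize > 2**32

    # Array from the error message: shape (23459822,), dtype int64 (8 bytes each).
    n_elements = 23459822
    INT64_BYTES = 8
    requested_bytes = n_elements * INT64_BYTES

    print(f"interpreter: {pointer_bits}-bit (64-bit build: {is_64bit})")
    print(f"requested allocation: {requested_bytes} bytes "
          f"({requested_bytes / 2**20:.1f} MiB)")
    ```

If this reports a 32-bit interpreter on Windows and a 64-bit one on Ubuntu, that difference may explain why the same code behaves differently on the two systems. If both are 64-bit, the cause lies elsewhere.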