numpy.memmap: maximum array size on an x32 machine?

Time: 2013-10-23 06:17:47

Tags: python arrays memory out-of-memory bigdata

I am using Python x32 on Windows XP x32.

Sometimes the program fails on this line:

fp = np.memmap('C:/memmap_test', dtype='float32', mode='w+', shape=(rows,cols))

with the following error in memmap.py:

Traceback (most recent call last):
    fp = np.memmap('C:/memmap_test', dtype='float32', mode='w+', shape=(rows,cols))
  File "C:\Python27\lib\site-packages\numpy\core\memmap.py", line 253, in __new__
    mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
OverflowError: cannot fit 'long' into an index-sized integer

So I assume there is a limit on the size of the array. What is the maximum array size, maxN = rows * cols?

The same question also applies to (1) Python x32 on Windows x64 and (2) Python x64 on Windows x64.

Update

import numpy as np

# create array
rows = 250000
cols = 1000
fA = np.memmap('A.npy', dtype='float32', mode='w+', shape=(rows, cols))
# fA1 = np.memmap('A1.npy', dtype='float32', mode='w+', shape=(rows, cols))  # can't create another big memmap
print(fA.nbytes // 1024 // 1024)  # 953 MB

So there seems to be another limit as well, not just the 2 GB limit on a single memmapped array.

Here is also the output of the test provided by @Paul:

working with 30000000 elements
number bytes required 0.240000 GB
works
working with 300000000 elements
number bytes required 2.400000 GB
OverflowError("cannot fit 'long' into an index-sized integer",)
working with 3000000000 elements
number bytes required 24.000000 GB
IOError(28, 'No space left on device')
working with 30000000000 elements
number bytes required 240.000000 GB
IOError(28, 'No space left on device')
working with 300000000000 elements
number bytes required 2400.000000 GB
IOError(28, 'No space left on device')
working with 3000000000000 elements
number bytes required 24000.000000 GB
IOError(22, 'Invalid argument')

1 answer:

Answer 0 (score: 2)

Here are some discussions on this topic: "How big can a memory-mapped file be?" and "Why doesn't Python's mmap work with large files?"

For the tests below, I used the following code:

import numpy

baseNumber = 3000000

for powers in range(1, 7):
    l1 = baseNumber * 10**powers
    print('working with %d elements' % l1)
    print('number bytes required %f GB' % (l1 * 8 / 1e9))
    try:
        # creates (or overwrites) test.map on disk, sized for l1 float64 values
        fp = numpy.memmap('test.map', dtype='float64', mode='w+', shape=(1, l1))
        print('works')
        del fp  # drop the reference to the mapping before the next attempt
    except Exception as e:
        print(repr(e))
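
Python x32 on Windows x32

When using 32-bit Windows, file sizes are restricted to around 2-3 GB, so Windows cannot create any file larger than this because of OS limitations. I don't have access to an x32 machine, but the command will fail once the file size limit is hit.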
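To turn a byte ceiling into the maxN = rows * cols that the question asks about, divide the ceiling by the element size. The sketch below assumes a ceiling of 2**31 - 1 bytes, which is only one reading of "around 2-3 GB"; the helper name `max_elements` is hypothetical, and the usable limit in a real 32-bit process may well be lower.

```python
import numpy as np


def max_elements(dtype, limit_bytes=2**31 - 1):
    """Largest element count whose total byte size stays within limit_bytes.

    limit_bytes defaults to an ASSUMED 2 GiB ceiling; it is not a measured value.
    """
    return limit_bytes // np.dtype(dtype).itemsize


print(max_elements('float32'))  # 4-byte elements
print(max_elements('float64'))  # 8-byte elements

# The 250000 x 1000 float32 array from the question stays under this ceiling.
print(250000 * 1000 <= max_elements('float32'))
```

Under that assumption a single float32 memmap tops out at roughly 537 million elements, and a float64 one at half of that.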

Python x32 on Windows x64

In this case, because Python is 32-bit, we cannot reach the file sizes that win64 allows.
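The OverflowError in these runs mentions an "index-sized integer", and the largest such value is exposed by the interpreter as `sys.maxsize`. The following is a minimal sketch, not part of the original test script, for checking in advance whether a requested shape would exceed it. The helper names `interpreter_bits` and `fits_index` are hypothetical, and the `max_index` argument exists only so that a 32-bit limit can be simulated on a 64-bit build.

```python
import struct
import sys

import numpy as np


def interpreter_bits():
    """Pointer width of the running Python build, in bits."""
    return struct.calcsize("P") * 8


def fits_index(shape, dtype, max_index=sys.maxsize):
    """Return True if the array's total byte count is at most max_index."""
    nbytes = np.dtype(dtype).itemsize
    for dim in shape:
        nbytes *= int(dim)
    return nbytes <= max_index


print(interpreter_bits(), sys.maxsize)

# Simulate a 32-bit interpreter (assumed limit 2**31 - 1) for two of the tested sizes.
print(fits_index((1, 30000000), 'float64', max_index=2**31 - 1))
print(fits_index((1, 300000000), 'float64', max_index=2**31 - 1))
```

With the simulated 32-bit limit, the 0.24 GB request fits and the 2.4 GB request does not, which matches where the OverflowError first appears in the output.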

%run -i scratch.py

python x32 win x64
working with 30000000 elements
number bytes required 0.240000 GB
works
working with 300000000 elements
number bytes required 2.400000 GB
OverflowError("cannot fit 'long' into an index-sized integer",)
working with 3000000000 elements
number bytes required 24.000000 GB
OverflowError("cannot fit 'long' into an index-sized integer",)
working with 30000000000 elements
number bytes required 240.000000 GB
IOError(28, 'No space left on device')
working with 300000000000 elements
number bytes required 2400.000000 GB
IOError(28, 'No space left on device')
working with 3000000000000 elements
number bytes required 24000.000000 GB
IOError(22, 'Invalid argument')

Python x64 on Windows x64

In this case, we are initially limited by the disk size, but once the array/byte size becomes large enough, some kind of overflow occurs.

%run -i scratch.py
working with 30000000 elements
number bytes required 0.240000 GB
works
working with 300000000 elements
number bytes required 2.400000 GB
works
working with 3000000000 elements
number bytes required 24.000000 GB
works
working with 30000000000 elements
number bytes required 240.000000 GB
IOError(28, 'No space left on device')
working with 300000000000 elements
number bytes required 2400.000000 GB
IOError(28, 'No space left on device')
working with 3000000000000 elements
number bytes required 24000.000000 GB
IOError(22, 'Invalid argument')

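Summary: On Windows x64, the precise point at which the array fails depends initially on your disk size.

Python x32 on Windows x64: at first we get the overflow errors that you were seeing, followed by the disk size limit, but at some point an invalid argument error is raised.

Python x64 on Windows x64: at first we hit the disk size limit, but at some point other errors are raised. Interestingly, these errors do not seem to be related to the 2**64 size limit, since 3000000000000*8 < 2**64, in the same way that these errors manifested themselves on win32.

It is possible that with a big enough disk we would not see the invalid argument error and could reach the 2**64 limit, although I do not have a large enough disk to test this :)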
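The arithmetic behind that observation can be checked directly. This small sketch recomputes the byte counts used by the test script (8 bytes per float64 element) and compares them with the 32-bit and 64-bit boundaries; it only verifies the numbers and does not explain the invalid argument error.

```python
# Byte counts requested by the test script: 3000000 * 10**p elements, 8 bytes each.
sizes = [3000000 * 10**p * 8 for p in range(1, 7)]

for nbytes in sizes:
    # Columns: bytes, fits in signed 32-bit, fits in signed 64-bit, below 2**64
    print(nbytes, nbytes < 2**31, nbytes < 2**63, nbytes < 2**64)
```

Only the smallest request stays below 2**31, while even the largest one, 24000000000000 bytes, is far below both 2**63 and 2**64.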