Question

我经常需要堆叠2d numpy数组（tiff图像）。为此，我首先将它们添加到列表中并使用np.dstack。这似乎是获取3D阵列堆叠图像的最快方法。但是，有更快/更有效的方式吗？

from time import time
import numpy as np

# Create 100 images of the same dimention 256x512 (8-bit). 
# In reality, each image comes from a different file
img = np.random.randint(0,255,(256, 512, 100))

t0 = time()
temp = []
for n in range(100):
    temp.append(img[:,:,n])
stacked = np.dstack(temp)
#stacked = np.array(temp)  # much slower 3.5 s for 100

print time()-t0  # 0.58 s for 100 frames
print stacked.shape

# dstack in each loop is slower
t0 = time()
temp = img[:,:,0]
for n in range(1, 100):
    temp = np.dstack((temp, img[:,:,n]))
print time()-t0  # 3.13 s for 100 frames
print temp.shape

# counter-intuitive but preallocation is slightly slower
stacked = np.empty((256, 512, 100))
t0 = time()
for n in range(100):
    stacked[:,:,n] = img[:,:,n]
print time()-t0  # 0.651 s for 100 frames
print stacked.shape

# (Edit) As in the accepted answer, re-arranging axis to mainly use 
# the first axis to access data improved the speed significantly.
img = np.random.randint(0,255,(100, 256, 512))

stacked = np.empty((100, 256, 512))
t0 = time()
for n in range(100):
    stacked[n,:,:] = img[n,:,:]
print time()-t0  # 0.08 s for 100 frames
print stacked.shape

Answer 1

经过与otterb的一些共同努力，我们得出结论，预先分配阵列是可行的方法。显然，性能查杀瓶颈是阵列布局，图像编号（n）是变化最快的索引。如果我们将n作为数组的第一个索引（默认为“C”排序：第一个索引changest最慢，最后一个索引变化最快），我们获得最佳性能：

from time import time
import numpy as np

# Create 100 images of the same dimention 256x512 (8-bit). 
# In reality, each image comes from a different file
img = np.random.randint(0,255,(100, 256, 512))

# counter-intuitive but preallocation is slightly slower
stacked = np.empty((100, 256, 512))
t0 = time()
for n in range(100):
    stacked[n] = img[n]
print time()-t0  
print stacked.shape

堆叠图像作为numpy数组更快（比预分配）？

1 个答案: