Question

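I have large 2-D arrays (typically 0.5 to 2 GB) of shape n x 1008. Each array holds several images, and the values in it are the pixel values. The images are recovered as follows: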

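Start iterating over the array.
Take the first 260 rows, i.e. 260 * 1008 = 262080 values.
From the 261st row, take only the first 64 values (the rest of that row is garbage). That gives 262144 pixel values.
Dump all of these values into a 1-D array, say dump, and call np.reshape(dump, (512, 512)) to get the image. Note that 512 x 512 = 262144.
Repeat the same procedure starting from the 262nd row.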
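
The layout these steps describe can be checked with a little arithmetic. The following is only a minimal sketch: the constant names and the helper `image_rows` are hypothetical and exist purely for illustration.

```python
ROW_LEN = 1008                  # values per row of the raw array
FULL_ROWS = 260                 # rows used in full for each image
TAIL = 64                       # values taken from the 261st row
ROWS_PER_IMAGE = FULL_ROWS + 1  # each image occupies 261 rows

def image_rows(k):
    """Return (first_row, last_row) of image k, zero-based and inclusive."""
    first = k * ROWS_PER_IMAGE
    return first, first + FULL_ROWS

# 260 full rows plus 64 values fill one 512 x 512 image exactly
assert FULL_ROWS * ROW_LEN + TAIL == 512 * 512

print(image_rows(0))  # (0, 260)
print(image_rows(1))  # (261, 521)
```

With zero-based indexing, image k therefore ends on row 261 * k + 260, which is the row from which only the first 64 values are kept.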

Here is my solution:

import numpy as np
from astropy.io import fits  # assumption: fits comes from astropy

counter = 0
dump = np.array([], dtype=np.uint16)
# pixelDat is the array shaped n x 1008 containing the pixel values
for j in range(len(pixelDat)):
    # Check if it is the last row for a particular image
    if j == 260 * (counter + 1) + counter:
        counter += 1
        dump = np.append(dump, pixelDat[j][:64])
        # Reshape dump to form the image and write it to a FITS file
        hdu = fits.PrimaryHDU(np.reshape(dump, (512, 512)))
        # older astropy/pyfits releases spell this option clobber=True
        hdu.writeto('img{0:0>4}.fits'.format(counter), overwrite=True)
        # Clear dump to enable formation of next image
        dump = np.array([], dtype=np.uint16)
    else:
        dump = np.append(dump, pixelDat[j])

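I have been wondering whether the whole process can be sped up. The first thing that came to mind was vectorized NumPy operations, but I am not sure how to apply them in this case.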

Answer 1

Here is an attempt that uses flattening and np.split. It avoids copying the data.

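import numpy as np

def chop_up(pixelDat):
    sh = pixelDat.shape
    try:
        # since the array is large we do not want a copy
        # the next line will succeed only if we can reshape in-place
        pixelDat.shape = -1
    except AttributeError:
        return False # user must resort to other method
    N = len(pixelDat)
    # offsets where each 261-row block starts and where its 512*512 pixels end;
    # [1:] drops the leading 0, which np.split does not need
    split = (np.arange(0, N, 261*1008)[:, None] + (0, 512*512)).ravel()[1:]
    if split[-1] > N:
        # the last block is truncated, so drop its start and end offsets
        split = split[:-2]
    # np.split returns views; keep only the pieces that are exactly one image long
    result = [x.reshape(512,512) for x in np.split(pixelDat, split) if len(x) == 512*512]
    pixelDat.shape = sh
    return result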
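
As a possible alternative, the same layout can also be expressed with plain reshaping and slicing. This is only a sketch under the assumptions stated in the question (261 rows per image, 1008 values per row, the first 512*512 values of each block being the image); the function name `extract_images` and the synthetic test data are made up for illustration and are not part of the original answer.

```python
import numpy as np

def extract_images(pixel_dat, rows_per_image=261, side=512):
    """Return a (num_images, side, side) array of the images in pixel_dat."""
    num_images = pixel_dat.shape[0] // rows_per_image
    block_len = rows_per_image * pixel_dat.shape[1]
    # drop incomplete trailing rows, then lay each image's block out as one row
    blocks = pixel_dat[:num_images * rows_per_image].reshape(num_images, block_len)
    # keep the first side*side values of each block; the rest is garbage
    return blocks[:, :side * side].reshape(num_images, side, side)

# Small synthetic check: 3 images, garbage values marked with 65535
rng = np.random.default_rng(0)
truth = rng.integers(0, 1000, size=(3, 512, 512), dtype=np.uint16)
raw = np.full((3, 261 * 1008), 65535, dtype=np.uint16)
raw[:, :512 * 512] = truth.reshape(3, -1)
pixel_dat = raw.reshape(-1, 1008)

images = extract_images(pixel_dat)
print(images.shape)                         # (3, 512, 512)
print(np.array_equal(images, truth))        # True
print(np.shares_memory(images, pixel_dat))  # whether the result is a view
```

For a C-contiguous input NumPy should be able to return views here rather than copies; for other memory layouts reshape silently falls back to copying, so the last line is worth checking on real data.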
