Question

I have a numpy array containing many (200 in this example) monochrome 64x64-pixel images, so it has the following shape:

>>> a.shape
(200L, 1L, 64L, 64L)

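I want to split these images into 3 new arrays a1, a2, a3, which will contain 80%, 10% and 10% of the images respectively, and I am doing it the following way (I do not want them to be consecutive in a):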

import numpy as np
import random

a = ...  # --read images from file--

# start with empty arrays that have the per-image shape (1, 64, 64)
a1 = np.empty((0,1,64,64))
a2 = np.empty((0,1,64,64))
a3 = np.empty((0,1,64,64))

for i in range(200): #200 is the number of images
    temp = a[-1]
    a = np.delete(a,-1,0)
    rand = random.random()
    if rand < 0.8:
        a1 = np.append(a1,[temp],0)
    elif rand < 0.9:
        a2 = np.append(a2,[temp],0)
    else:
        a3 = np.append(a3,[temp],0)

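I tried to emulate pop and append, which are done in O(1) on lists, but does the same hold for numpy arrays? Is there a way to do this more efficiently (faster) for a large number (thousands) of images?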

Answer 1

Here's a one-liner using np.vsplit -

a1,a2,a3 = np.vsplit(a[np.random.permutation(a.shape[0])],(160,180))
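
The cut points (160,180) are just 80% and 90% of the 200 images. For a different number of images, the same idea can be wrapped in a small helper. This is only a sketch: `random_split`, `cut_fractions` and `seed` are hypothetical names introduced for illustration, and `np.random.default_rng` (NumPy 1.17 or later) is swapped in for `np.random.permutation` purely to make the shuffle reproducible.

```python
import numpy as np

def random_split(a, cut_fractions=(0.8, 0.9), seed=None):
    """Shuffle `a` along axis 0, then cut it at the given cumulative fractions.

    Hypothetical helper for illustration; not part of NumPy.
    """
    n = a.shape[0]
    rng = np.random.default_rng(seed)  # seed is optional, for repeatable runs
    # Cumulative fractions -> integer cut points, e.g. (0.8, 0.9) -> [160, 180] for n = 200
    cuts = [int(round(f * n)) for f in cut_fractions]
    shuffled = a[rng.permutation(n)]   # fancy indexing builds one shuffled copy
    return np.vsplit(shuffled, cuts)

a = np.random.rand(200, 1, 64, 64)
a1, a2, a3 = random_split(a, seed=0)
print(a1.shape, a2.shape, a3.shape)  # (160, 1, 64, 64) (20, 1, 64, 64) (20, 1, 64, 64)
```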

1) Shape check:

In [205]: a = np.random.rand(200,1,64,64)

In [206]: a1,a2,a3 = np.vsplit(a[np.random.permutation(a.shape[0])],(160,180))

In [207]: a.shape
Out[207]: (200, 1, 64, 64)

In [208]: a1.shape
Out[208]: (160, 1, 64, 64)

In [209]: a2.shape
Out[209]: (20, 1, 64, 64)

In [210]: a3.shape
Out[210]: (20, 1, 64, 64)
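
The sizes here are exactly 160/20/20 because the cut points are fixed. The loop in the question instead draws one random number per image, so its group sizes are 80%/10%/10% only on average. A minimal sketch of that draw, counting group sizes without touching any image data (the seed is an arbitrary choice made only so the run is repeatable):

```python
import random

random.seed(0)  # arbitrary seed, assumed here only for repeatability

sizes = [0, 0, 0]  # number of images that would land in a1, a2, a3
for _ in range(200):
    r = random.random()
    if r < 0.8:
        sizes[0] += 1
    elif r < 0.9:
        sizes[1] += 1
    else:
        sizes[2] += 1

# The three counts always sum to 200, but they need not be exactly 160/20/20
print(sizes)
```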

2) Value check on toy data, to make sure we are picking random images rather than consecutive ones for the split:

In [212]: a
Out[212]: 
array([[5, 8, 4],
       [7, 7, 6],
       [3, 2, 7],
       [1, 4, 8],
       [4, 1, 0],
       [2, 1, 3],
       [6, 5, 2],
       [2, 4, 5],
       [6, 6, 5],
       [5, 2, 5]])

In [213]: a1,a2,a3 = np.vsplit(a[np.random.permutation(a.shape[0])],(6,8))

In [214]: a1
Out[214]: 
array([[1, 4, 8],
       [7, 7, 6],
       [6, 6, 5],
       [2, 4, 5],
       [4, 1, 0],
       [5, 2, 5]])

In [215]: a2
Out[215]: 
array([[3, 2, 7],
       [2, 1, 3]])

In [216]: a3
Out[216]: 
array([[6, 5, 2],
       [5, 8, 4]])
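
The value check above is done by eye. It can also be automated by splitting the shuffled row indices instead of the shuffled array, which makes it easy to confirm that every row ends up in exactly one group. A sketch under that assumption (the names `perm`, `i1`, `i2`, `i3` and the `np.arange` toy data are illustrative, not from the original session):

```python
import numpy as np

a = np.arange(30).reshape(10, 3)          # toy data: 10 distinct rows
perm = np.random.permutation(a.shape[0])  # shuffled row indices
i1, i2, i3 = np.split(perm, (6, 8))       # same cut points as the toy example
a1, a2, a3 = a[i1], a[i2], a[i3]

# Every row index appears exactly once across the three groups
all_idx = np.concatenate([i1, i2, i3])
assert np.array_equal(np.sort(all_idx), np.arange(a.shape[0]))

# Putting the groups back at their original positions recovers `a`
restored = np.empty_like(a)
restored[all_idx] = np.vstack([a1, a2, a3])
assert np.array_equal(restored, a)
```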
