Efficient random views of a numpy array with dropped dimensions

Date: 2017-09-22 16:05:12

Tags: python numpy

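When training computer vision models, random cropping is often used as a data augmentation technique. At each iteration, a batch of random crops is generated and fed to the network being trained. This needs to be efficient, since it happens at every training iteration.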

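If the data has too many dimensions, random dimension selection may also be needed. For example, one can select random frames in a video. The data can even have 4 dimensions (3 in space + time), or more.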

How can one write an efficient generator of random lower-dimensional views?

A very naive version that gets 2D views from 3D data, one at a time, could be:

import numpy as np
import numpy.random as nr

def views():
    # suppose `data` comes from elsewhere
    # data.shape is (n1, n2, n3); `shape` is assumed to be data.shape
    while True:
        drop_dim = nr.randint(0, 3)
        drop_dim_keep = nr.randint(0, shape[drop_dim])
        selector = np.zeros(shape, dtype=bool)
        if drop_dim == 0:
            selector[drop_dim_keep, :, :] = 1
        elif drop_dim == 1:
            selector[:, drop_dim_keep, :] = 1
        else:
            selector[:, :, drop_dim_keep] = 1
        yield np.squeeze(data[selector])

A more elegant solution probably exists, where at least:

  • there is no ugly if/else on the randomly chosen dimension
  • views can take an integer argument and generate several views at once, without a loop
  • the dimensionality of the input/output data is not fixed (e.g. it can do 3D -> 2D as well as 4D -> 2D)

1 answer:

Answer 0: (score: 0)

I tweaked your function to make it clearer what it is doing:

def views():
    # suppose `data` comes from elsewhere
    # data.shape is (n1, n2, n3)
    while True:
        drop_dim = nr.randint(0, 3)
        dropshape = list(shape[:])
        dropshape[drop_dim] -= 1
        drop_dim_keep = nr.randint(0, shape[drop_dim])
        print(drop_dim, drop_dim_keep)
        selector = np.ones(shape, dtype=bool)
        if drop_dim == 0:
            selector[drop_dim_keep, :, :] = 0
        elif drop_dim == 1:
            selector[:, drop_dim_keep, :] = 0
        else:
            selector[:, :,  drop_dim_keep] = 0
        yield data[selector].reshape(dropshape)

A small sample run:

In [534]: data = np.arange(24).reshape(shape)
In [535]: data
Out[535]: 
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
In [536]: v = views()
In [537]: next(v)
2 1
Out[537]: 
array([[[ 0,  2,  3],
        [ 4,  6,  7],
        [ 8, 10, 11]],

       [[12, 14, 15],
        [16, 18, 19],
        [20, 22, 23]]])
In [538]: next(v)
0 0
Out[538]: 
array([[[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

So it picks one of the dimensions and drops one "column" along that dimension.

The main efficiency question is whether it returns a view or a copy. In this case it has to return a copy.

You are using a boolean mask to select what is returned, which is exactly what np.delete does in this case.

In [544]: np.delete(data,1,2).shape
Out[544]: (2, 3, 3)
In [545]: np.delete(data,0,0).shape
Out[545]: (1, 3, 4)
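As a quick check of the two claims above, the following sketch compares the boolean-mask result with np.delete and uses np.shares_memory to confirm that neither shares a buffer with the input. The (2, 3, 4) array mirrors the sample run; the variable names are only illustrative.

```python
import numpy as np

data = np.arange(24).reshape(2, 3, 4)

# Boolean mask that drops index 1 along the last axis.
mask = np.ones(data.shape, dtype=bool)
mask[:, :, 1] = False
masked = data[mask].reshape(2, 3, 3)

# The same selection expressed with np.delete.
deleted = np.delete(data, 1, axis=2)

print(np.array_equal(masked, deleted))   # True
print(np.shares_memory(data, masked))    # False: boolean indexing copies
print(np.shares_memory(data, deleted))   # False: np.delete copies too
```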

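So you could replace most of your code with delete and let it take care of generalizing over dimensions. Look at its source to see how it handles those details (it isn't short and sweet!).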

def rand_delete():
    # suppose `data` comes from elsewhere
    # data.shape is (n1, n2, n3)
    while True:
        drop_dim = nr.randint(0, 3)
        drop_dim_keep = nr.randint(0, shape[drop_dim])
        print(drop_dim, drop_dim_keep)
        yield np.delete(data, drop_dim_keep, drop_dim)

In [547]: v1=rand_delete()
In [548]: next(v1)
0 1
Out[548]: 
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]]])
In [549]: next(v1)
2 0
Out[549]: 
array([[[ 1,  2,  3],
        [ 5,  6,  7],
        [ 9, 10, 11]],

       [[13, 14, 15],
        [17, 18, 19],
        [21, 22, 23]]])
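Since np.delete takes the axis as an argument, the hard-coded 3 and the global shape are not really needed. A minimal sketch of a dimension-agnostic variant follows; passing data and an optional rng as parameters is an assumption made here for illustration, not part of the code above.

```python
import numpy as np

def rand_delete_nd(data, rng=None):
    """Yield copies of `data` with one random index dropped along one random axis."""
    rng = np.random.default_rng() if rng is None else rng
    while True:
        axis = int(rng.integers(data.ndim))           # works for any number of dimensions
        index = int(rng.integers(data.shape[axis]))   # position to drop on that axis
        yield np.delete(data, index, axis=axis)

gen = rand_delete_nd(np.arange(24).reshape(2, 3, 4))
print(next(gen).shape)   # one of (1, 3, 4), (2, 2, 4), (2, 3, 3)
```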

Replacing delete with take keeps the selected slice instead of dropping it:

def rand_take():
    while True:
        take_dim = nr.randint(0, 3)
        take_keep = nr.randint(0, shape[take_dim])
        print(take_dim, take_keep)
        yield np.take(data, take_keep, axis=take_dim)

In [580]: t = rand_take()
In [581]: next(t)
0 0
Out[581]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
In [582]: next(t)
2 3
Out[582]: 
array([[ 3,  7, 11],
       [15, 19, 23]])

np.take returns a copy, but the equivalent slicing does not:

In [601]: data.__array_interface__['data']
Out[601]: (182632568, False)
In [602]: np.take(data,0,1).__array_interface__['data']
Out[602]: (180099120, False)
In [603]: data[:,0,:].__array_interface__['data']
Out[603]: (182632568, False)
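Comparing raw buffer addresses works here, but a view that starts at a nonzero offset would show a different address even though it shares memory. A slightly more robust check, sketched below on the same (2, 3, 4) array, is np.shares_memory.

```python
import numpy as np

data = np.arange(24).reshape(2, 3, 4)

taken = np.take(data, 0, axis=1)   # new array
sliced = data[:, 0, :]             # basic indexing, a view

print(np.array_equal(taken, sliced))    # True: same values
print(np.shares_memory(data, taken))    # False: np.take copied
print(np.shares_memory(data, sliced))   # True: slicing shares the buffer
```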

An indexing tuple can be built with expressions like:

In [604]: idx = [slice(None)]*data.ndim
In [605]: idx[1] = 0
In [606]: data[tuple(idx)]
Out[606]: 
array([[ 0,  1,  2,  3],
       [12, 13, 14, 15]])

Various numpy functions that take an axis parameter construct an indexing tuple like this (e.g. one or more of the apply... functions).
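Putting the indexing-tuple idea into the generator gives a version with no if/else that returns true views for any number of dimensions. This is only a sketch: the function name and the data and rng parameters are illustrative assumptions, and it yields one view at a time rather than a batch.

```python
import numpy as np

def random_views(data, rng=None):
    """Yield views of `data` with one randomly chosen axis fixed at a random index."""
    rng = np.random.default_rng() if rng is None else rng
    while True:
        axis = int(rng.integers(data.ndim))
        idx = [slice(None)] * data.ndim                   # keep every axis whole...
        idx[axis] = int(rng.integers(data.shape[axis]))   # ...except the dropped one
        yield data[tuple(idx)]                            # basic indexing, so no copy

data = np.arange(24).reshape(2, 3, 4)
view = next(random_views(data))
print(view.ndim, np.shares_memory(view, data))   # 2 True
```

Because each result is itself an ndarray, the generator can be applied again to a yielded view to go from 4D to 2D, at the cost of one Python-level step per dropped dimension.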