Subsetting a 2D numpy array while keeping the rows intact

Time: 2017-08-19 01:55:13

Tags: python arrays numpy indexing slice

I would like to know the simplest way to do the following:

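Suppose we have the following 2D arrays a and b, and we select the entries of b at the positions where a == 'f'. The selection below yields a flat list. However, I would like to preserve the row structure, i.e. get [3, 5, 6] from the first row and [10, 12, 13] from the second: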

>>> a = np.array([['z', 'z', 'z', 'f', 'z','f', 'f'], ['z', 'z', 'z', 'f', 'z','f', 'f']])

array([['z', 'z', 'z', 'f', 'z', 'f', 'f'],
       ['z', 'z', 'z', 'f', 'z', 'f', 'f']],
      dtype='<U1')



>>> b = np.array(range(0,14)).reshape(2, -1)


array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13]])


>>> idxs = list(zip(*np.where(a == 'f')))

[(0, 3), (0, 5), (0, 6), (1, 3), (1, 5), (1, 6)]


>>> [b[x] for x in idxs]

[3, 5, 6, 10, 12, 13]

Is there an easy way to keep this structure?

4 answers:

Answer 0: (score: 2)

Use a for loop over the rows (written here as a list comprehension):

[b[i][a[i] == 'f'] for i in range(len(a))]
# [array([3, 5, 6]), array([10, 12, 13])]

Answer 1: (score: 2)

Here is a more complicated, but pure NumPy, solution:

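  1. Get the indices (in the flattened version of a) where the value is 'f'.
  2. Get the indices at which a new row starts.
  3. Find which of the indices from step 1 belong to each row.
  4. Split the array at those indices.

The code looks like this: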

    >>> indices = np.flatnonzero(a.ravel() == 'f')
    >>> rows = np.arange(1, a.shape[0])*a.shape[1]
    >>> np.split(b.ravel()[indices], np.searchsorted(indices, rows))
    [array([3, 5, 6], dtype=int64), array([10, 12, 13], dtype=int64)]
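
The four steps above can be traced one at a time. The following is only an illustrative sketch using the arrays from the question; the names cuts and parts are introduced here for readability and are not part of the original answer.

```python
import numpy as np

a = np.array([['z', 'z', 'z', 'f', 'z', 'f', 'f'],
              ['z', 'z', 'z', 'f', 'z', 'f', 'f']])
b = np.arange(14).reshape(2, -1)

# Step 1: positions of 'f' in the flattened array.
indices = np.flatnonzero(a.ravel() == 'f')

# Step 2: flat index at which each row after the first one begins.
rows = np.arange(1, a.shape[0]) * a.shape[1]

# Step 3: position among the matches where each new row starts.
cuts = np.searchsorted(indices, rows)

# Step 4: split the selected values at those positions.
parts = np.split(b.ravel()[indices], cuts)

print(indices)  # flat positions of the matches
print(rows)     # flat index where the second row starts
print(cuts)     # split points within the matches
print(parts)    # one array per row
```

With these inputs there is a single split point, because there are only two rows; with more rows, rows and cuts each gain one entry per additional row.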
    

It is a bit longer than the other solution, and I'm not sure whether it is any faster (see the timings below).

Personally, though, I would use a list comprehension with zip:

    [b_row[a_row] for a_row, b_row in zip(a == 'f', b)]
    

It is shorter and, according to my timings, very efficient.

Timings:

    import numpy as np
    a = np.array([['z', 'z', 'z', 'f', 'z','f', 'f']]*10000)
    b = np.arange(a.size).reshape(-1, a.shape[1])
    
    %%timeit
    
    indices = np.flatnonzero(a.ravel() == 'f')
    rows = np.arange(1, a.shape[0])*a.shape[1]
    np.split(b.ravel()[indices], np.searchsorted(indices, rows))
    
      

123 ms ± 8.25 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

    %timeit [b[i][a[i] == 'f'] for i in range(len(a))]
    
      

162 ms ± 14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

But both are much slower than my suggestion at Psidom's answer:

    %timeit [b_row[a_row] for a_row, b_row in zip(a == 'f', b)]
    
      

44.9 ms ± 1.93 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
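
The %timeit and %%timeit magics above only work inside IPython or Jupyter. As a rough sketch, a similar comparison could be run as a plain script with the standard-library timeit module. The function names with_split, with_index_loop and with_zip are made up for this example, and the numbers it prints will differ from the ones quoted above depending on the machine and NumPy version.

```python
import timeit
import numpy as np

a = np.array([['z', 'z', 'z', 'f', 'z', 'f', 'f']] * 10000)
b = np.arange(a.size).reshape(-1, a.shape[1])

def with_split():
    # Pure NumPy approach: flatten, find matches, split at row boundaries.
    indices = np.flatnonzero(a.ravel() == 'f')
    rows = np.arange(1, a.shape[0]) * a.shape[1]
    return np.split(b.ravel()[indices], np.searchsorted(indices, rows))

def with_index_loop():
    # Index-based loop that builds one mask per row.
    return [b[i][a[i] == 'f'] for i in range(len(a))]

def with_zip():
    # Builds the full mask once, then pairs mask rows with rows of b.
    return [b_row[a_row] for a_row, b_row in zip(a == 'f', b)]

for func in (with_split, with_index_loop, with_zip):
    # Best of 3 single calls; absolute values are machine dependent.
    best = min(timeit.repeat(func, number=1, repeat=3))
    print(f"{func.__name__}: {best * 1e3:.1f} ms per call")
```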

Answer 2: (score: 1)

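import numpy as np

a = np.array([['z', 'z', 'z', 'f', 'z', 'f', 'f'], ['z', 'z', 'z', 'f', 'z', 'f', 'f']])
b = np.array(range(0, 14)).reshape(2, -1)

# (row, column) pairs of every position where a == 'f'
idxs = list(zip(*np.where(a == 'f')))

# One output list per row of a; append each selected value to its row's list
c = [[] for _ in range(len(a))]
for x in idxs:
    c[x[0]].append(b[x])

print(c)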

Answer 3: (score: 1)

In [89]: idx = np.where(a == 'f')
In [90]: idx
Out[90]: 
(array([0, 0, 0, 1, 1, 1], dtype=int32),
 array([3, 5, 6, 3, 5, 6], dtype=int32))

We can apply the where tuple to select items from b:

In [93]: b[idx]
Out[93]: array([ 3,  5,  6, 10, 12, 13])

Equivalently, apply the boolean mask directly:

In [94]: b[a == 'f']
Out[94]: array([ 3,  5,  6, 10, 12, 13])

np.argwhere takes the transpose of the where result, producing a 2D array similar to idxs:

In [95]: np.argwhere(a == 'f')
Out[95]: 
array([[0, 3],
       [0, 5],
       [0, 6],
       [1, 3],
       [1, 5],
       [1, 6]], dtype=int32)
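
The relationship between np.where and np.argwhere can be checked directly. This is a small sketch reusing a from the question; the names mask, pairs and stacked are illustrative only.

```python
import numpy as np

a = np.array([['z', 'z', 'z', 'f', 'z', 'f', 'f'],
              ['z', 'z', 'z', 'f', 'z', 'f', 'f']])
mask = a == 'f'

# np.where returns one index array per dimension (rows, columns).
row_idx, col_idx = np.where(mask)

# np.argwhere returns one (row, column) pair per match.
pairs = np.argwhere(mask)

# Transposing the where tuple gives the same pairs.
stacked = np.transpose(np.where(mask))
print(np.array_equal(pairs, stacked))
```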

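As discussed in "Delete all elements in an array corresponding to Boolean mask", we generally cannot select elements with a mask and keep any kind of 2D structure. In specific cases, though, we can reshape the 1D result into something meaningful: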

In [96]: b[idx].reshape(2,-1)
Out[96]: 
array([[ 3,  5,  6],
       [10, 12, 13]])
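
The reshape only makes sense when every row contributes the same number of matches. One way to guard against the other case is sketched below; counts and result are names invented for this example, and the list fallback is just one possible choice.

```python
import numpy as np

a = np.array([['z', 'z', 'z', 'f', 'z', 'f', 'f'],
              ['z', 'z', 'z', 'f', 'z', 'f', 'f']])
b = np.arange(14).reshape(2, -1)

mask = a == 'f'
counts = mask.sum(axis=1)  # number of matches in each row

if (counts == counts[0]).all():
    # Same count in every row: the flat selection reshapes cleanly.
    result = b[mask].reshape(len(a), counts[0])
else:
    # Ragged case: fall back to a list with one array per row.
    result = [row[m] for row, m in zip(b, mask)]

print(result)
```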

A simple way to collect these values row by row, allowing a different number of results in each row, is to iterate:

In [100]: [j[i=='f'] for i,j in zip(a,b)]
Out[100]: [array([3, 5, 6]), array([10, 12, 13])]
In [101]: [j[i=='f'].tolist() for i,j in zip(a,b)]
Out[101]: [[3, 5, 6], [10, 12, 13]]