NumPy broadcast indices from shapes

Time: 2014-11-28 22:27:19

Tags: python arrays numpy indexing

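I have two array shapes that can be broadcast against each other.

e.g. (2,2,1) and (2,3)

I want a function that takes these shapes and gives me an iterator that returns, for each element of the broadcast result, the index into an array of each input shape together with the corresponding index in the output array.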

iter, output_shape = broadcast_indeces_iterator((2, 2, 1), (2, 3))
assert output_shape == (2, 2, 3)
for in1_ix, in_2_ix, out_ix in iter:
    print (in1_ix, in_2_ix, out_ix)    

which would produce the output:

(0, 0, 0), (0, 0), (0, 0, 0)
(0, 0, 0), (0, 1), (0, 0, 1)
(0, 0, 0), (0, 2), (0, 0, 2)
(0, 1, 0), (1, 0), (0, 1, 0)
(0, 1, 0), (1, 1), (0, 1, 1)
(0, 1, 0), (1, 2), (0, 1, 2)
(1, 0, 0), (0, 0), (1, 0, 0)
(1, 0, 0), (0, 1), (1, 0, 1)
(1, 0, 0), (0, 2), (1, 0, 2)
(1, 1, 0), (1, 0), (1, 1, 0)
(1, 1, 0), (1, 1), (1, 1, 1)
(1, 1, 0), (1, 2), (1, 1, 2)

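np.broadcast comes close, but it requires the arrays to actually be created.

  • Note to NumPy people: it would be nice if np.broadcast had an extra argument that lets you skip iterating over, say, the last 2 dimensions. That would also solve my problem.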
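
For reference, here is a small sketch of what np.broadcast provides once real arrays exist. The arrays x and y are illustrative assumptions (the question only gives the shapes). Iterating the broadcast object yields pairs of values, not the indices asked for.

```python
import numpy as np

# Illustrative arrays (an assumption; the question only specifies shapes)
x = 10 * np.arange(4).reshape(2, 2, 1)
y = 100 * np.arange(6).reshape(2, 3)

b = np.broadcast(x, y)
out_shape = b.shape  # shape of the broadcast result

# Iterating yields one (x_value, y_value) pair per output element, in C order
pairs = [(int(u), int(v)) for u, v in b]

print(out_shape, len(pairs))
print(pairs[:4])
```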

3 answers:

Answer 0 (score: 3)

import numpy as np
x = 10*np.arange(4).reshape((2, 2, 1))
y = 100*np.arange(6).reshape((2, 3))

z = np.nditer([x, y], flags=['multi_index', 'c_index'], order='C')
for a,b in z:
    print(np.unravel_index(z.index % x.size, x.shape)
          , np.unravel_index(z.index % y.size, y.shape)
          , z.multi_index)

yields

((0, 0, 0), (0, 0), (0, 0, 0))
((0, 1, 0), (0, 1), (0, 0, 1))
((1, 0, 0), (0, 2), (0, 0, 2))
((1, 1, 0), (1, 0), (0, 1, 0))
((0, 0, 0), (1, 1), (0, 1, 1))
((0, 1, 0), (1, 2), (0, 1, 2))
((1, 0, 0), (0, 0), (1, 0, 0))
((1, 1, 0), (0, 1), (1, 0, 1))
((0, 0, 0), (0, 2), (1, 0, 2))
((0, 1, 0), (1, 0), (1, 1, 0))
((1, 0, 0), (1, 1), (1, 1, 1))
((1, 1, 0), (1, 2), (1, 1, 2))
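
In the listing above, the first column steps through the indices of x in order instead of repeating them along the broadcast axis as in the question's expected output. One possible variation, sketched here as an assumption and not part of the answer itself, is to let the operands hold their own flat indices, so that the values nditer hands back can be unraveled directly. The names ix1, ix2 and to_index are hypothetical.

```python
import numpy as np

shape1, shape2 = (2, 2, 1), (2, 3)

# Each operand stores its own flat (C-order) index as its value
ix1 = np.arange(4).reshape(shape1)
ix2 = np.arange(6).reshape(shape2)

def to_index(flat, shape):
    # Hypothetical helper: flat C-order index -> tuple of plain ints
    return tuple(int(i) for i in np.unravel_index(int(flat), shape))

it = np.nditer([ix1, ix2], flags=['multi_index'], order='C')
triples = []
for a, b in it:
    # a and b are the broadcast values, i.e. flat indices into each input
    triples.append((to_index(a, shape1), to_index(b, shape2), it.multi_index))

for t in triples:
    print(t)
```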

Answer 1 (score: 1)

What a good question, Peter. Here is your answer:

import numpy as np


def get_broadcast_shape(*shapes):
    '''
    Given a set of array shapes, return the shape of the output when arrays of those 
    shapes are broadcast together
    '''
    max_ndim = max(len(s) for s in shapes)
    # Left-pad shorter shapes with 1s so every shape has the same length
    equal_len_shapes = np.array([(1, )*(max_ndim-len(s))+tuple(s) for s in shapes])
    max_dim_shapes = np.max(equal_len_shapes, axis = 0)
    # Each axis length must be 1 or equal to the largest length on that axis
    assert np.all(np.logical_or(equal_len_shapes==1, equal_len_shapes == max_dim_shapes[None, :])), \
        'Shapes %s are not broadcastable together' % (shapes, )
    return tuple(max_dim_shapes)


def get_broadcast_indeces(*shapes):
    '''
    Given a set of shapes of arrays that you could broadcast together, return:
        output_shape: The shape of the resulting output array
        broadcast_shape_iterator: An iterator that returns a len(shapes)+1 tuple
            of the indeces of each input array and their corresponding index in the 
            output array
    '''
    output_shape = get_broadcast_shape(*shapes)
    base_iter = np.ndindex(output_shape)

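    def broadcast_shape_iterator():
        for out_ix in base_iter:
            # Right-align each input shape with the output index; axes of
            # length 1 are broadcast, so they are always indexed at 0.
            in_ixs = tuple(
                tuple(0 if n == 1 else ix
                      for n, ix in zip(s, out_ix[len(out_ix) - len(s):]))
                for s in shapes)
            yield in_ixs + (out_ix, )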

    return output_shape, broadcast_shape_iterator()


output_shape, ix_iter = get_broadcast_indeces((2, 2, 1), (2, 3))
assert output_shape == (2, 2, 3)
for in1_ix, in_2_ix, out_ix in ix_iter:
    print((in1_ix, in_2_ix, out_ix))

which returns

((0, 0, 0), (0, 0), (0, 0, 0))
((0, 0, 0), (0, 1), (0, 0, 1))
((0, 0, 0), (0, 2), (0, 0, 2))
((0, 1, 0), (1, 0), (0, 1, 0))
((0, 1, 0), (1, 1), (0, 1, 1))
((0, 1, 0), (1, 2), (0, 1, 2))
((1, 0, 0), (0, 0), (1, 0, 0))
((1, 0, 0), (0, 1), (1, 0, 1))
((1, 0, 0), (0, 2), (1, 0, 2))
((1, 1, 0), (1, 0), (1, 1, 0))
((1, 1, 0), (1, 1), (1, 1, 1))
((1, 1, 0), (1, 2), (1, 1, 2))

If anyone knows of a NumPy built-in that solves this problem, that would be better.
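
For the shape half of the problem only, newer NumPy releases (1.20 and later, to my knowledge) include np.broadcast_shapes, which takes shapes directly and raises ValueError when they are incompatible. This is a minimal sketch and does not replace the index iterator above.

```python
import numpy as np

# np.broadcast_shapes works on shape tuples; no arrays are created
out_shape = np.broadcast_shapes((2, 2, 1), (2, 3))
print(out_shape)

# Incompatible shapes raise ValueError instead of failing an assert
try:
    np.broadcast_shapes((2,), (3,))
    compatible = True
except ValueError:
    compatible = False
print(compatible)
```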

Answer 2 (score: 1)

Here's a start:

import numpy as np

array1 = np.arange(4).reshape(2,2,1)*10
array2 = np.arange(6).reshape(2,3)

# I and J are broadcast views of array1 and array2 with a common shape
I, J = np.broadcast_arrays(array1, array2)
print(I.shape)
K = np.empty(I.shape, dtype=int)
for ijk in np.ndindex(I.shape):
    K[ijk] = I[ijk]+J[ijk]
print(K)

producing

(2, 2, 3)  # broadcasted shape

[[[ 0  1  2]
  [13 14 15]]    
 [[20 21 22]
  [33 34 35]]]

I has shape (2,2,3), but it shares its data with array1; it is a broadcast view, not a copy (look at its .__array_interface__).
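
One way to check the view claim, as a small sketch: a broadcast axis shows up as a stride of 0, and np.shares_memory reports whether the two arrays overlap in memory.

```python
import numpy as np

array1 = np.arange(4).reshape(2, 2, 1) * 10
array2 = np.arange(6).reshape(2, 3)
I, J = np.broadcast_arrays(array1, array2)

# True if I is a view onto array1's buffer and not a copy
shares = np.shares_memory(I, array1)

# A stride of 0 marks an axis that was stretched by broadcasting
print(I.shape, I.strides, shares)
print(J.shape, J.strides)
```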

You can iterate over just the first 2 dimensions by giving ndindex only that part of the shape.

K = np.empty(I.shape, dtype=int)
for i,j in np.ndindex(I.shape[:2]):
    K[i,j,:] = I[i,j,:]+J[i,j,:]
    print(K[i,j,:])

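The basic pieces can be found by looking at the code for broadcast_arrays and ndindex. For example, in https://stackoverflow.com/a/25097271/901925 I call nditer directly to generate the multi_index (an operation that can be adapted to cython).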

# y is a 3-D array from the linked answer; it is not defined here
xx = np.zeros(y.shape[:2])
it = np.nditer(xx, flags=['multi_index'])
while not it.finished:
    print(y[it.multi_index], end=' ')
    it.iternext()
# [242  14 211] [198   7   0] [235  60  81] [164  64 236]

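To make 'dummy' arrays that are nearly cost-free, I can take a cue from ndindex and build the arrays starting from np.zeros(1):

import numpy as np
from numpy.lib.stride_tricks import as_strided

def make_dummy(shape):
    # A single stored element viewed with all-zero strides, so no data
    # proportional to the shape is allocated
    x = as_strided(np.zeros(1),shape=shape, strides=np.zeros_like(shape))
    return x
array1 = make_dummy((2,2,1))
array2 = make_dummy((2,3))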

I could dig into np.broadcast_arrays to see how it combines the shapes of the 2 input arrays to form the shape of I.
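
Without reading that source, one option is to hand such dummy arrays to np.broadcast and read the combined shape from it. This is a sketch under that assumption; make_dummy here is a self-contained variant using a plain tuple of zero strides.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def make_dummy(shape):
    # One stored element viewed with all-zero strides
    return as_strided(np.zeros(1), shape=shape, strides=(0,) * len(shape))

d1 = make_dummy((2, 2, 1))
d2 = make_dummy((2, 3))

# np.broadcast only inspects shapes and strides to find the result shape
out_shape = np.broadcast(d1, d2).shape

# The output indices can then be generated from the shape alone
indices = list(np.ndindex(*out_shape))
print(out_shape, len(indices), indices[:4])
```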


There is a difference between your desired solution and mine.

(0, 0, 0), (0, 0), (0, 0, 0)
(0, 0, 0), (0, 1), (0, 0, 1)
...
(1, 1, 0), (1, 1), (1, 1, 1)
(1, 1, 0), (1, 2), (1, 1, 2)

expects a different index tuple for each array, one ranging over (2,2,1), another over (2,3), and so on.

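My approach, which I think is the one used by the numpy C code (at least the parts based on nditer), generates a single iterator over (2,2,3) and uses as_strided to let each array accept that larger range. It is easier to implement a general broadcasting mechanism this way, because it separates the broadcasting complexity from the computational core.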
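
The two views of the problem can be related with a short check. This is a sketch in which input_index is a hypothetical helper, not a NumPy function: indexing the broadcast view with the output index returns the same element as indexing the original array with the per-array index from the question.

```python
import numpy as np

array1 = np.arange(4).reshape(2, 2, 1) * 10
array2 = np.arange(6).reshape(2, 3)
I, J = np.broadcast_arrays(array1, array2)

def input_index(out_ix, shape):
    # Hypothetical helper: right-align shape with out_ix and use 0 on
    # every axis of length 1, since those axes are broadcast
    tail = out_ix[len(out_ix) - len(shape):]
    return tuple(0 if n == 1 else i for n, i in zip(shape, tail))

checked = 0
for out_ix in np.ndindex(*I.shape):
    # Same element either way: via the broadcast view or the original array
    assert I[out_ix] == array1[input_index(out_ix, array1.shape)]
    assert J[out_ix] == array2[input_index(out_ix, array2.shape)]
    checked += 1
print(checked)
```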

Here is a good introduction to nditer:

http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html