Question

I am using numpy to tally a lot of values across many large arrays, and to keep track of the positions where the maximum values appear.

In particular, imagine I have a 'counts' array:

data = numpy.array([[ 5, 10, 3],
                    [ 6, 9, 12],
                    [13, 3,  9],
                    [ 9, 3,  1],
                    ...
                    ])
counts = numpy.zeros(data.shape, dtype=int)  # numpy.int was removed in NumPy 1.24; the builtin int is equivalent

data will change a lot, but I want 'counts' to reflect the number of times the maximum has appeared in each position:

max_value_indices = numpy.argmax(data, axis=1)
# this is now [1, 2, 0, 0, ...] representing the positions of 10, 12, 13 and 9, respectively.

From my understanding of numpy broadcasting, I should be able to say:

counts[max_value_indices] += 1

I expect this to update the array to:

[[0, 1, 0],
 [0, 0, 1],
 [1, 0, 0],
 [1, 0, 0],
 ...
]

But instead this increments all the values in counts:

[[1, 1, 1],
 [1, 1, 1],
 [1, 1, 1],
 [1, 1, 1],
 ...
]

I also thought that if I converted max_value_indices to a 100x1 array, it might work:

counts[max_value_indices[:,numpy.newaxis]] += 1

But this only updates the elements in positions 0, 1 and 2:

[[1, 1, 1],
 [1, 1, 1],
 [1, 1, 1],
 [0, 0, 0],
 ...
]

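I would also be happy to turn the indices array into an array of 0s and 1s and then add it to the counts array each time, but I don't know how to construct that.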

Answer 1

You can use so-called advanced integer indexing (a.k.a. multidimensional list-of-locations indexing):

In [24]: counts[np.arange(data.shape[0]), 
                np.argmax(data, axis=1)] += 1

In [25]: counts
Out[25]: 
array([[0, 1, 0],
       [0, 0, 1],
       [1, 0, 0],
       [1, 0, 0]])
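
For anyone who wants to reproduce the session above outside IPython, here is a minimal self-contained sketch. It assumes only the four rows of data shown in the question (the real array is larger), and the names rows, cols and row_only are introduced here for illustration. The second half shows, for contrast, what a single index array does with this sample.

```python
import numpy as np

# Sample data: only the four rows shown in the question.
data = np.array([[ 5, 10,  3],
                 [ 6,  9, 12],
                 [13,  3,  9],
                 [ 9,  3,  1]])
counts = np.zeros(data.shape, dtype=int)

rows = np.arange(data.shape[0])   # [0, 1, 2, 3]
cols = np.argmax(data, axis=1)    # [1, 2, 0, 0]

# Two index arrays are paired element by element:
# counts[rows[i], cols[i]] is incremented for each i.
counts[rows, cols] += 1
print(counts)

# For contrast: a single index array selects whole rows, so with this
# sample every column of rows 0, 1 and 2 changes and row 3 does not.
row_only = np.zeros(data.shape, dtype=int)
row_only[cols] += 1
print(row_only)
```

Because every (row, column) pair is distinct here, the plain += is enough. If the same pair could appear more than once in a single update, np.add.at(counts, (rows, cols), 1) would be the safer choice, since += applies repeated indices only once.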

The first array, np.arange(data.shape[0]), specifies the rows. The second array, np.argmax(data, axis=1), specifies the columns.
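
The question also mentions turning the indices into an array of 0s and 1s and adding that to counts. That is not what the answer above does, but one possible way to build such an array is to compare the argmax result against the column numbers with broadcasting. This is only a sketch on the four sample rows; column_ids and one_hot are names made up for this example.

```python
import numpy as np

# Sample data: only the four rows shown in the question.
data = np.array([[ 5, 10,  3],
                 [ 6,  9, 12],
                 [13,  3,  9],
                 [ 9,  3,  1]])
counts = np.zeros(data.shape, dtype=int)

max_value_indices = np.argmax(data, axis=1)   # shape (4,)
column_ids = np.arange(data.shape[1])         # shape (3,)

# Broadcasting a (4, 1) array against a (3,) array gives a (4, 3) boolean
# array that is True exactly where the column number equals that row's argmax.
one_hot = (max_value_indices[:, np.newaxis] == column_ids).astype(int)

# Add the 0/1 array to the running tally.
counts += one_hot
print(counts)
```

This allocates a temporary array the size of data on every update, so the indexed += from the answer is probably the lighter option for large arrays.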
