Question

I have a matrix M:

M = [[10, 1000],
 [11, 200],
 [15, 800],
 [20, 5000],
 [28, 100],
 [32, 3000],
 [35, 3500],
 [38, 100],
 [50, 5000],
 [51, 100],
 [55, 2000],
 [58, 3000],
 [66, 4000],
 [90, 5000]]

And a matrix R:

 [[10 20]
  [32 35]
  [50 66]
  [90 90]]

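I want to use the values in column 0 of matrix R as the start of a slice, and the values in column 1 as the end of the slice.

I want to sum the right-hand column of matrix M over each of these ranges, with both the start and the end of each range included.

Basically, doing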

  M[0:4][:,1].sum() # upper index + 1, because the upper bound must be included
  M[5:7][:,1].sum() # upper index + 1, because the upper bound must be included

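and so on. 0 is the index of 10 and 3 is the index of 20; 5 is the index of 32 and 6 is the index of 35.

I am stuck on how to get from the start/end values in matrix R to the matching indices in column 0 of matrix M, and then how to sum over each index range with both bounds included.

Expected output: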

[[10, 20, 7000], # 7000 = 1000+200+800+5000
 [32, 35, 6500], # 6500 = 3000+3500
 [50, 66, 14100], # 14100 = 5000+100+2000+3000+4000
 [90, 90, 5000]] # 5000 = just 5000 as upper=lower boundary

Update: I can now get the indices with searchsorted. All that is left is to sum column 1 of matrix M between each start and end position.

 start_indices = [0,5,8,13]
 end_indices = [3,6,12,13]
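For reference, a minimal sketch of how these indices can be obtained with searchsorted, followed by the plain loop that the question wants to avoid. It assumes M and R are ordinary NumPy arrays, that column 0 of M is sorted, and that every value in R actually occurs in that column.

```python
import numpy as np

M = np.array([[10, 1000], [11, 200], [15, 800], [20, 5000],
              [28, 100], [32, 3000], [35, 3500], [38, 100],
              [50, 5000], [51, 100], [55, 2000], [58, 3000],
              [66, 4000], [90, 5000]])
R = np.array([[10, 20], [32, 35], [50, 66], [90, 90]])

# Column 0 of M is sorted, so searchsorted returns the row index of each
# value from R (assumption: every value in R is present in M[:, 0]).
start_indices = np.searchsorted(M[:, 0], R[:, 0])
end_indices = np.searchsorted(M[:, 0], R[:, 1])
print(start_indices)  # [ 0  5  8 13]
print(end_indices)    # [ 3  6 12 13]

# Baseline loop: slice ends are exclusive, so add 1 to include the upper bound.
sums = [int(M[s:e + 1, 1].sum()) for s, e in zip(start_indices, end_indices)]
print(sums)  # [7000, 6500, 14100, 5000]
```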

Is there a more efficient way to do this than a for loop?

Edit: found the answer here: Numpy sum of values in subarrays between pairs of indices

Answer 1

Use searchsorted to determine the correct indices, and add.reduceat to perform the summation:

>>> idx = M[:, 0].searchsorted(R) + (0, 1)
>>> idx = idx.ravel()[:-1] if idx[-1, 1] == M.shape[0] else idx.ravel()
>>> result = np.add.reduceat(M[:, 1], idx)[::2]
>>> result
array([ 7000,  6500, 14100,  5000])
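To get from this result to the three-column layout the question asks for, the sums can be stacked next to R. The sketch below wraps the same steps in a helper; the name range_sums is hypothetical, and it assumes M and R are plain ndarrays (not np.matrix) with the values of R present in the sorted column 0 of M.

```python
import numpy as np

def range_sums(M, R):
    """Hypothetical helper: return R with a third column holding the sum of
    M[:, 1] over each inclusive [start, end] range."""
    # Row indices of the start/end values; +1 on the ends makes them inclusive.
    idx = M[:, 0].searchsorted(R) + (0, 1)
    flat = idx.ravel()
    # reduceat does not accept len(M) as an index, so drop it if present;
    # the final segment then runs to the end of M on its own.
    if flat[-1] == M.shape[0]:
        flat = flat[:-1]
    # Every second segment is a gap between two ranges, so keep only [::2].
    sums = np.add.reduceat(M[:, 1], flat)[::2]
    return np.column_stack((R, sums))

M = np.array([[10, 1000], [11, 200], [15, 800], [20, 5000],
              [28, 100], [32, 3000], [35, 3500], [38, 100],
              [50, 5000], [51, 100], [55, 2000], [58, 3000],
              [66, 4000], [90, 5000]])
R = np.array([[10, 20], [32, 35], [50, 66], [90, 90]])

table = range_sums(M, R)
print(table.tolist())
# [[10, 20, 7000], [32, 35, 6500], [50, 66, 14100], [90, 90, 5000]]
```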

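Details:

Since you want to include the upper bound, but Python slicing excludes it, we have to add 1.

reduceat cannot handle len(arg0) as an index, so we have to special-case that.

reduceat computes the sums of all stretches between consecutive boundaries, so we have to discard every other one.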
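The three details above are easier to follow with the intermediate values printed out. This is only an illustrative walk-through on the data from the question, assuming M and R are plain ndarrays; the variable names flat and all_sums are not part of the original answer.

```python
import numpy as np

M = np.array([[10, 1000], [11, 200], [15, 800], [20, 5000],
              [28, 100], [32, 3000], [35, 3500], [38, 100],
              [50, 5000], [51, 100], [55, 2000], [58, 3000],
              [66, 4000], [90, 5000]])
R = np.array([[10, 20], [32, 35], [50, 66], [90, 90]])

# Detail 1: shift every end index by one, because slice ends are exclusive.
idx = M[:, 0].searchsorted(R) + (0, 1)
print(idx.tolist())  # [[0, 4], [5, 7], [8, 13], [13, 14]]

# Detail 2: the last index, 14, equals len(M), which reduceat rejects.
try:
    np.add.reduceat(M[:, 1], idx.ravel())
except IndexError as exc:
    print("reduceat rejected the index:", exc)

# Dropping it is safe: the final segment runs to the end of M anyway.
flat = idx.ravel()[:-1]
print(flat.tolist())  # [0, 4, 5, 7, 8, 13, 13]

# Detail 3: reduceat returns one sum per boundary, including the gaps
# between ranges (rows 4 and 7 here), so only every second value is wanted.
all_sums = np.add.reduceat(M[:, 1], flat)
print(all_sums.tolist())       # [7000, 100, 6500, 100, 14100, 5000, 5000]
print(all_sums[::2].tolist())  # [7000, 6500, 14100, 5000]
```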

Answer 2

I think it would be best to show an example of the output you expect. If what you want from M[0:4][:,1].sum() is the sum 1000 + 200 + 800 + 5000, then this code may help:

import numpy as np

M = np.matrix([[10, 1000],
 [11, 200],
 [15, 800],
 [20, 5000],
 [28, 100],
 [32, 3000],
 [35, 3500],
 [38, 100],
 [50, 5000],
 [51, 100],
 [55, 2000],
 [58, 3000],
 [66, 4000],
 [90, 5000]])


print(M[0:4][:,1].sum())

numpy: how to calculate the sum of array slices using indices?
