Question

I have a matrix X of shape (d, N). In other words, there are N vectors, each of dimension d. For example,

X = [[1,2,3,4],[5,6,7,8]]

has N = 4 vectors of dimension d = 2.

In addition, I have a ragged array (a list of lists) I. Its entries are column indices into the matrix X. For example,

I = [ [0,1], [1,2,3] ]

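I[0] = [0,1] indexes columns 0 and 1 of the matrix X. Similarly, I[1] indexes columns 1, 2 and 3. Note that the elements of I are lists that do not all have the same length!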

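What I want to do is use each element of I to index columns of X, sum those column vectors, and get a single vector. Repeating this for every element of I builds a new matrix Y. Y should have as many d-dimensional vectors as there are elements in I. In my example, Y will have 2 vectors of dimension 2.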

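In my example, I[0] says to take columns 0 and 1 of X, add those two 2-dimensional vectors, and put the result in column 0 of Y. Then I[1] says to sum columns 1, 2 and 3 of X and put that new vector in column 1 of Y.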

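I can do this easily with a loop, but I would like to vectorize the operation if possible. My matrix X has hundreds of thousands of columns, and the index array I has tens of thousands of elements (each one a short list of indices).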

My loop code:

Y = np.zeros( (d,len(I)) )
for i,idx in enumerate(I):
    Y[:,i] = np.sum( X[:,idx], axis=1 )
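For reference, here is a minimal self-contained sketch of the loop above, run on the example X and I from this question. The import line and deriving `d` from `X.shape` are additions for illustration only.

```python
import numpy as np

# Example data from the question: N = 4 vectors of dimension d = 2
X = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])
I = [[0, 1], [1, 2, 3]]

d = X.shape[0]  # assumption: d is taken from the shape of X
Y = np.zeros((d, len(I)))
for i, idx in enumerate(I):
    # Sum the selected columns of X into column i of Y
    Y[:, i] = np.sum(X[:, idx], axis=1)

print(Y)  # column 0 is [3, 11], column 1 is [9, 21]
```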

Answer 1

Here is one approach:

# Get a flattened version of the indices
idx0 = np.concatenate(I)

# Get the start positions at which we need to do "intervaled summation" along axis=1
# (list(...) is needed on Python 3, where map returns an iterator)
cut_idx = np.append(0, list(map(len, I)))[:-1].cumsum()

# Finally, index into the columns of X with the flattened indices and sum each interval
out = np.add.reduceat(X[:, idx0], cut_idx, axis=1)
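The three steps above could be wrapped into one function, roughly as sketched below. The name `ragged_column_sums` is hypothetical, and the check for empty index lists is an added assumption: `np.add.reduceat` does not return 0 for an empty interval, so this sketch rejects that case rather than returning a misleading result.

```python
import numpy as np

def ragged_column_sums(X, I):
    """Sum the columns of X selected by each index list in I (hypothetical helper)."""
    lengths = np.array([len(idx) for idx in I])
    # Assumption: every index list is non-empty, since reduceat
    # does not produce a zero sum for an empty interval.
    if (lengths == 0).any():
        raise ValueError("every index list in I must be non-empty")
    idx0 = np.concatenate(I)                       # flattened column indices
    cut_idx = np.append(0, lengths)[:-1].cumsum()  # start position of each group
    return np.add.reduceat(X[:, idx0], cut_idx, axis=1)

# Example data from the question
X = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])
I = [[0, 1], [1, 2, 3]]
Y = ragged_column_sums(X, I)
print(Y)  # column 0 is [3, 11], column 1 is [9, 21]
```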

Step-by-step run:

In [67]: X
Out[67]: 
array([[ 1,  2,  3,  4],
       [15,  6, 17,  8]])

In [68]: I
Out[68]: array([[0, 2, 3, 1], [2, 3, 1], [2, 3]], dtype=object)

In [69]: idx0 = np.concatenate(I)

In [70]: idx0 # Flattened indices
Out[70]: array([0, 2, 3, 1, 2, 3, 1, 2, 3])

In [71]: cut_idx = np.append(0, list(map(len, I)))[:-1].cumsum()

In [72]: cut_idx # We need to do addition in intervals limited by these indices
Out[72]: array([0, 4, 7])

In [74]: X[:,idx0]  # Select all of the indexed columns
Out[74]: 
array([[ 1,  3,  4,  2,  3,  4,  2,  3,  4],
       [15, 17,  8,  6, 17,  8,  6, 17,  8]])

In [75]: np.add.reduceat(X[:,idx0], cut_idx,axis=1)
Out[75]: 
array([[10,  9,  7],
       [46, 31, 25]])
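As a sanity check on the run above, the following sketch rebuilds the same X and I and compares the `reduceat` result with the plain loop from the question. Here `I` is written as a plain list of lists rather than an object array, which is an assumption made for simplicity.

```python
import numpy as np

# Same data as in the step-by-step run
X = np.array([[1, 2, 3, 4],
              [15, 6, 17, 8]])
I = [[0, 2, 3, 1], [2, 3, 1], [2, 3]]

# Vectorized approach
idx0 = np.concatenate(I)
cut_idx = np.append(0, [len(idx) for idx in I])[:-1].cumsum()
out = np.add.reduceat(X[:, idx0], cut_idx, axis=1)

# Reference result computed with the question's loop
expected = np.zeros((X.shape[0], len(I)), dtype=X.dtype)
for i, idx in enumerate(I):
    expected[:, i] = X[:, idx].sum(axis=1)

print(np.array_equal(out, expected))  # True
```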

Vectorize numpy indexing and apply a function to build a matrix
