I am writing an algorithm in which I need to "collapse" or "reduce" a matrix according to the cluster assignments of its nodes. However, the current implementation is the bottleneck of my full algorithm (profiled with the Visual Studio Python profiler).
import numpy as np

def reduce_matrix(mat: np.matrix, cluster_ids: np.array) -> np.matrix:
    """Reduce node adjacency matrix.

    Arguments:
        mat: Adjacency matrix
        cluster_ids: Cluster membership assignment per current node (integers)

    Returns:
        Reduced adjacency matrix
    """
    ordered_nodes = np.argsort(cluster_ids)
    counts = np.unique(cluster_ids, return_counts=True)[1]
    ends = np.cumsum(counts)
    starts = np.concatenate([[0], ends[:-1]])
    clusters = [ordered_nodes[start:end] for start, end in zip(starts, ends)]

    n_c = len(counts)
    reduced = np.mat(np.zeros((n_c, n_c), dtype=int))
    for a in range(n_c):
        a_nodes = clusters[a]
        for b in range(a + 1, n_c):
            b_nodes = clusters[b]
            reduced[a, b] = np.sum(mat[a_nodes, :][:, b_nodes])
            reduced[b, a] = np.sum(mat[b_nodes, :][:, a_nodes])
    return reduced
What is the fastest way to sum arbitrary rows and columns of a matrix? I believe the double indexing [a_nodes, :][:, b_nodes] creates a copy of the matrix rather than a view, but I'm not sure whether there is a faster way around it...
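As a side note on the double-indexing question: chained fancy indexing like mat[a_nodes, :][:, b_nodes] first materializes an intermediate copy of the selected rows, while a single np.ix_ selection picks rows and columns in one step. A minimal sketch with made-up toy data (my own illustration, not part of the question):

```python
import numpy as np

# Toy "adjacency" matrix and two hypothetical node sets
mat = np.arange(36).reshape(6, 6)
a_nodes = np.array([0, 2])
b_nodes = np.array([1, 5])

# Chained indexing: copies the selected rows first, then the columns
chained = mat[a_nodes, :][:, b_nodes]

# np.ix_ builds an open mesh, so both selections happen in one pass
direct = mat[np.ix_(a_nodes, b_nodes)]

assert np.array_equal(chained, direct)
print(chained.sum())  # → 36
```

Both give the same block, but np.ix_ avoids the intermediate row copy, which matters when the selections are large.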
Answer 0 (score: 2)
Numba can speed up this task in a very natural way, with no sorting concerns. Numpy is not efficient here because many irregular chunks have to be managed:
import numba
import numpy as np

@numba.jit
def reduce_matrix2(mat, cluster_ids):
    n_c = len(set(cluster_ids))
    out = np.zeros((n_c, n_c), dtype=int)
    for i, i_c in enumerate(cluster_ids):
        for j, j_c in enumerate(cluster_ids):
            out[i_c, j_c] += mat[i, j]
    np.fill_diagonal(out, 0)
    return out
On a 5000x5000 matrix:

In [40]: %timeit r=reduce_matrix2(mat,cluster_ids)
30.3 ms ± 5.34 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
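For comparison, the same accumulation (before zeroing the diagonal) can also be written as a pure-NumPy matrix product with a one-hot cluster-membership matrix. This is my own sketch, not part of the answer, and it assumes the cluster ids are the integers 0..n_c-1:

```python
import numpy as np

def reduce_matrix_onehot(mat, cluster_ids):
    # One-hot membership matrix B: B[i, c] == 1 iff node i is in cluster c
    n = len(cluster_ids)
    n_c = cluster_ids.max() + 1
    B = np.zeros((n, n_c), dtype=mat.dtype)
    B[np.arange(n), cluster_ids] = 1
    # B.T @ mat @ B sums mat over every (row-cluster, col-cluster) block
    out = B.T @ mat @ B
    np.fill_diagonal(out, 0)
    return out
```

This stays fully vectorized, though the two dense matmuls cost O(n^2 * n_c) and may still lose to the Numba loop for large inputs.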
Answer 1 (score: 1)
We can reduce this to one loop by performing the reduction on bigger chunks with np.add.reduceat over intervals, which should be more efficient. An implementation would look like this -
# Get ordered nodes
ordered_nodes = np.argsort(cluster_ids)

# Get indexed array
M = mat[np.ix_(ordered_nodes, ordered_nodes)]

# Get group boundaries on sorted cluster ids
sc = cluster_ids[ordered_nodes]
cut_idx = np.flatnonzero(np.r_[True, sc[1:] != sc[:-1], True])

# Setup output array
n_c = len(cut_idx)-1
out = np.zeros((n_c, n_c), dtype=mat.dtype)

# Per iteration perform reduction on chunks of indexed array M
# defined by cut_idx as boundaries
for i, (s0, s1) in enumerate(zip(cut_idx[:-1], cut_idx[1:])):
    out[i] = np.add.reduceat(M[s0:s1], cut_idx[:-1], axis=1).sum(0)
np.fill_diagonal(out, 0)
Proposed approach as a function -
def addreduceat_app(mat, cluster_ids):
    ordered_nodes = np.argsort(cluster_ids)
    M = mat[np.ix_(ordered_nodes, ordered_nodes)]
    sc = cluster_ids[ordered_nodes]
    cut_idx = np.flatnonzero(np.r_[True, sc[1:] != sc[:-1], True])

    n_c = len(cut_idx)-1
    out = np.zeros((n_c, n_c), dtype=mat.dtype)
    for i, (s0, s1) in enumerate(zip(cut_idx[:-1], cut_idx[1:])):
        out[i] = np.add.reduceat(M[s0:s1], cut_idx[:-1], axis=1).sum(0)
    np.fill_diagonal(out, 0)
    return np.matrix(out)
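For readers unfamiliar with np.add.reduceat: given a list of start indices, it sums each contiguous slice between consecutive indices along the given axis, which is exactly how the per-cluster column sums above are produced. A tiny illustration (my example, not from the answer):

```python
import numpy as np

row = np.array([1, 2, 3, 4, 5, 6])
# Start indices [0, 2, 5] define the slices [0:2], [2:5], [5:]
sums = np.add.reduceat(row, [0, 2, 5])
print(sums)  # → [ 3 12  6]
```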
Timings and verification on a dataset with 5000 nodes and 500 unique clusters -
In [518]: np.random.seed(0)
...: mat = np.random.randint(0,10,(5000,5000))
...: cluster_ids = np.random.randint(0,500,(5000))
In [519]: out1 = reduce_matrix(mat, cluster_ids)
...: out2 = addreduceat_app(mat, cluster_ids)
...: print np.allclose(out1, out2)
True
In [520]: %timeit reduce_matrix(mat, cluster_ids)
...: %timeit addreduceat_app(mat, cluster_ids)
1 loop, best of 3: 8.39 s per loop
10 loops, best of 3: 195 ms per loop