Speeding up extraction of an edge matrix with NumPy array indexing

Time: 2016-04-16 17:34:12

Tags: python algorithm numpy matrix indexing

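I want to cut selected edges out of an original matrix and would like to know whether there is a faster way to do it. I need to run the selectEdge function many times with the same positions and positions_u, so the indices stay the same across many graphs. Is it possible to generate a mapping matrix once that handles all of them?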

Thanks a lot.

def selectEdge(positions, positions_u, originalMat, selectedMat):
    """ select Edge by neighbors of all points
    Many to many: m entries in positions and n entries in positions_u
    give m*n edges. selectedMat is updated in place and returned.
    """
    for ele in positions:
        for ele_u in positions_u:            
            selectedMat[ele][ele_u] += originalMat[ele][ele_u]
            selectedMat[ele_u][ele] += originalMat[ele_u][ele]
    return selectedMat

I only need the upper triangular part, since the matrix is symmetric.

def test_selectEdge(self):
    positions, positions_u = np.array([0, 1, 5, 7]), np.array([2, 3, 4, 6])
    # 8x8 matrix of ones as input, 8x8 matrix of zeros as the accumulator
    originalMat, selectedMat = np.ones((8, 8)), np.zeros((8, 8))
    selectedMat = selectEdge(positions, positions_u, originalMat, selectedMat)
    print('position, positions_u')
    print(positions, positions_u)
    print('originalMat', originalMat)
    print('selectedMat', selectedMat)

Here is my test result:

position, positions_u
[0 1 5 7] [2 3 4 6]
originalMat 
[[ 1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.]]
selectedMat 
[[ 0.  0.  1.  1.  1.  0.  1.  0.]
 [ 0.  0.  1.  1.  1.  0.  1.  0.]
 [ 1.  1.  0.  0.  0.  1.  0.  1.]
 [ 1.  1.  0.  0.  0.  1.  0.  1.]
 [ 1.  1.  0.  0.  0.  1.  0.  1.]
 [ 0.  0.  1.  1.  1.  0.  1.  0.]
 [ 1.  1.  0.  0.  0.  1.  0.  1.]
 [ 0.  0.  1.  1.  1.  0.  1.  0.]]

The following implementation, which selects the edges to each node's neighbors, is even slower:

def selectNeighborEdges(originalMat, selectedMat, relation):
    """ select Edge by neighbors of all points
    one to many
    Args:
        relation: dict, {node1:[node i, node j,...], node2:[node i, node j, ...]}

    update selectedMat
    """
    for key in relation:
        selectedMat = selectEdge([key], relation[key], originalMat, selectedMat)
    return selectedMat

1 Answer:

Answer 0: (score: 3)

You can eliminate the double for-loop by using "advanced integer indexing":

X, Y = positions[:,None], positions_u[None,:]
selectedMat[X, Y] += originalMat[X, Y]
selectedMat[Y, X] += originalMat[Y, X]
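
To make the indexing above easier to follow, here is a small sketch of what X and Y look like. It reuses the index sets from the question; the 8x8 matrix filled with np.arange and the names grid_shape, block and same_block are only illustrative assumptions.

```python
import numpy as np

positions = np.array([0, 1, 5, 7])
positions_u = np.array([2, 3, 4, 6])

# A (4, 1) column and a (1, 4) row broadcast to a (4, 4) grid, so the
# pair (X, Y) addresses every (ele, ele_u) combination exactly once.
X, Y = positions[:, None], positions_u[None, :]
grid_shape = np.broadcast(X, Y).shape

originalMat = np.arange(64, dtype=float).reshape(8, 8)

# block[i, j] == originalMat[positions[i], positions_u[j]]
block = originalMat[X, Y]

# np.ix_ builds the same open mesh from the two 1-D index arrays
same_block = originalMat[np.ix_(positions, positions_u)]
```

Swapping the order to [Y, X] addresses the transposed pairs (ele_u, ele), which is why the second statement mirrors the first.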

例如,

import numpy as np

def selectEdge(positions, positions_u, originalMat, selectedMat):
    for ele in positions:
        for ele_u in positions_u:
            selectedMat[ele][ele_u] += originalMat[ele][ele_u]
            selectedMat[ele_u][ele] += originalMat[ele_u][ele]
    return selectedMat

def alt_selectEdge(positions, positions_u, originalMat, selectedMat):
    X, Y = positions[:,None], positions_u[None,:]
    selectedMat[X, Y] += originalMat[X, Y]
    selectedMat[Y, X] += originalMat[Y, X]
    return selectedMat


N, M = 100, 50
positions = np.random.choice(np.arange(N), M, replace=False)
positions_u = np.random.choice(np.arange(N), M, replace=False)
originalMat = np.random.random((N, N))
selectedMat = np.zeros_like(originalMat)

First check that selectEdge and alt_selectEdge return the same result:

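# Both functions modify selectedMat in place and return it, so give each call
# its own copy; otherwise expected and result would be the same array.
expected = selectEdge(positions, positions_u, originalMat, selectedMat.copy())
result = alt_selectEdge(positions, positions_u, originalMat, selectedMat.copy())
assert np.allclose(expected, result)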
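
One caveat on this check: with advanced indexing, `a[idx] += v` is buffered, so an index pair that occurs more than once is only updated once, whereas the loop in selectEdge adds every time it visits the pair. The sample data avoids the problem because np.random.choice(..., replace=False) yields unique entries in each array. If positions or positions_u may contain repeats, np.add.at is one possible way to keep the loop's behavior. The function name add_at_selectEdge and the tiny 4x4 data below are made up for illustration.

```python
import numpy as np

def add_at_selectEdge(positions, positions_u, originalMat, selectedMat):
    """Variant of alt_selectEdge that accumulates repeated index pairs."""
    X = np.asarray(positions)[:, None]
    Y = np.asarray(positions_u)[None, :]
    # np.add.at is unbuffered: a pair listed k times is added k times.
    np.add.at(selectedMat, (X, Y), originalMat[X, Y])
    np.add.at(selectedMat, (Y, X), originalMat[Y, X])
    return selectedMat

positions = np.array([0, 0, 1])   # 0 is repeated on purpose
positions_u = np.array([2, 3])
originalMat = np.ones((4, 4))

# Plain fancy-index "+=": the repeated pair (0, 2) is only added once.
buffered = np.zeros((4, 4))
Xd, Yd = positions[:, None], positions_u[None, :]
buffered[Xd, Yd] += originalMat[Xd, Yd]

# np.add.at: the repeated pair is added twice, as the double loop would do.
unbuffered = add_at_selectEdge(positions, positions_u, originalMat,
                               np.zeros((4, 4)))
```

np.add.at is generally slower than a plain `+=`, so it is probably only worth using when repeated indices are actually possible.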

Here is a timeit benchmark (using IPython):

In [89]: %timeit selectEdge(positions, positions_u, originalMat, selectedMat)
100 loops, best of 3: 4.44 ms per loop

In [90]: %timeit alt_selectEdge(positions, positions_u, originalMat, selectedMat)
10000 loops, best of 3: 104 µs per loop
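
The question also asks whether a mapping matrix could be generated once and reused when positions and positions_u stay fixed. One possible sketch, not benchmarked here, is to precompute a weight matrix that counts how often selectEdge would add each cell, then apply it with a single multiply. The helper names build_weights and build_weights_from_relation are hypothetical, and n (the matrix size) is assumed to be known in advance.

```python
import numpy as np

def build_weights(positions, positions_u, n):
    """Count how many times selectEdge would add each cell of an n x n matrix."""
    X = np.asarray(positions)[:, None]
    Y = np.asarray(positions_u)[None, :]
    weights = np.zeros((n, n))
    np.add.at(weights, (X, Y), 1.0)   # the (ele, ele_u) updates
    np.add.at(weights, (Y, X), 1.0)   # the mirrored (ele_u, ele) updates
    return weights

def build_weights_from_relation(relation, n):
    """Same idea for the {node: [neighbors]} dict used by selectNeighborEdges."""
    rows, cols = [], []
    for key, neighbors in relation.items():
        rows.extend([key] * len(neighbors))
        cols.extend(neighbors)
    rows = np.array(rows, dtype=int)
    cols = np.array(cols, dtype=int)
    weights = np.zeros((n, n))
    np.add.at(weights, (rows, cols), 1.0)
    np.add.at(weights, (cols, rows), 1.0)
    return weights

positions, positions_u = np.array([0, 1, 5, 7]), np.array([2, 3, 4, 6])
weights = build_weights(positions, positions_u, 8)   # built once

originalMat = np.random.random((8, 8))
selectedMat = np.zeros_like(originalMat)
selectedMat += weights * originalMat   # reused for each new originalMat
upper = np.triu(selectedMat)           # keep only the upper triangle if desired
```

The multiply touches all n*n cells, so when only a few edges are selected it may well be slower than indexing with precomputed X and Y; timing both on the real data would be needed to decide.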