Avoiding a double for loop in Python

Asked: 2018-09-13 17:08:12

Tags: python numpy

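I am trying to find a way to use numpy to avoid a double for loop in Python, but I am not sure whether it is possible.

I have two 3D matrices that I have flattened into two 2D matrices using numpy's flatten(), and now I need to run a calculation on every row of the first against every row of the second. Essentially, each row represents an image, and I run a series of calculations on the two vectors and return a scalar.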

A [a, b, c, d]     A' [a, b, c, d]
B [e, f, g, h]     B' [e, f, g, h]
C [i, j, k, l]     C' [i, j, k, l]
D [m, n, o, p]     D' [m, n, o, p]

result
[AA' AB' AC' AD']
[BA' BB' BC' BD']
[CA' CB' CC' CD']
[DA' DB' DC' DD']

Edit: here is my double for loop

import numpy as np

def euclidean_distance(a, b):
    # Euclidean (L2) distance between two 1-D vectors
    return np.sqrt(np.sum(np.square(np.subtract(a, b))))

def find_distance_of_two_sets(aMatrix, bMatrix):
    # distance[i][j] is the distance between row i of aMatrix and row j of bMatrix
    distance = np.zeros((len(aMatrix), len(bMatrix)))
    for i, a in enumerate(aMatrix):
        for j, b in enumerate(bMatrix):
            distance[i][j] = euclidean_distance(a, b)
    with open('distanceMatrix', 'wb') as outputFile:
        np.save(outputFile, distance)
    return distance

aMatrix = np.array([[5, 3, 2, 1, 4, 2],
        [7, 0, 3, 5, 7, 9],
        [9, 8, 0, 2, 4, 8],
        [3, 5, 2, 0, 1, 9],
        [7, 7, 4, 1, 7, 6],
        [5, 9, 8, 9, 6, 1]])

distance = find_distance_of_two_sets(aMatrix, aMatrix)

If I print the result, it is

[[ 0.          9.38083152  9.05538514  8.18535277  7.         11.87434209]
 [ 9.38083152  0.          9.79795897 10.14889157  8.66025404 13.82027496]
 [ 9.05538514  9.79795897  0.          7.93725393  5.91607978 13.52774926]
 [ 8.18535277 10.14889157  7.93725393  0.          8.36660027 15.03329638]
 [ 7.          8.66025404  5.91607978  8.36660027  0.         10.67707825]
 [11.87434209 13.82027496 13.52774926 15.03329638 10.67707825  0.        ]]

1 answer:

Answer 0 (score: 2)

Use vectorized operations and broadcast the second array.

Setup

a = np.array([[5, 3, 2, 1, 4, 2],
        [7, 0, 3, 5, 7, 9],
        [9, 8, 0, 2, 4, 8],
        [3, 5, 2, 0, 1, 9],
        [7, 7, 4, 1, 7, 6],
        [5, 9, 8, 9, 6, 1]])

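# a[:, None] has shape (6, 1, 6), so the subtraction broadcasts to (6, 6, 6):
# every row of a minus every row of a
d = (a - a[:, None])**2
# sum the squared differences over the last axis, then take the square root
np.sqrt(d.sum(-1)).round(2)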

array([[ 0.  ,  9.38,  9.06,  8.19,  7.  , 11.87],
       [ 9.38,  0.  ,  9.8 , 10.15,  8.66, 13.82],
       [ 9.06,  9.8 ,  0.  ,  7.94,  5.92, 13.53],
       [ 8.19, 10.15,  7.94,  0.  ,  8.37, 15.03],
       [ 7.  ,  8.66,  5.92,  8.37,  0.  , 10.68],
       [11.87, 13.82, 13.53, 15.03, 10.68,  0.  ]])
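
The same broadcasting idea should carry over to two different matrices, which is what the question describes. The following is a minimal sketch, not part of the original answer; the function name pairwise_euclidean and the small a and b arrays are made up for illustration, with rows borrowed from the question's matrix.

```python
import numpy as np

def pairwise_euclidean(a, b):
    """Distance between every row of a and every row of b via broadcasting."""
    # a[:, None, :] has shape (n, 1, k); b[None, :, :] has shape (1, m, k).
    # Subtracting them broadcasts both operands to shape (n, m, k).
    diff = a[:, None, :] - b[None, :, :]
    # Sum the squared differences over the last axis, leaving shape (n, m).
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical inputs: 2 rows against 3 rows, 6 values per row.
a = np.array([[5, 3, 2, 1, 4, 2],
              [7, 0, 3, 5, 7, 9]])
b = np.array([[9, 8, 0, 2, 4, 8],
              [3, 5, 2, 0, 1, 9],
              [7, 7, 4, 1, 7, 6]])

dist = pairwise_euclidean(a, b)
print(dist.shape)  # (2, 3)
```

Here dist[i, j] is the distance between a[i] and b[j], the same layout as distance[i][j] in the question's loop.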

Performance

a = np.random.rand(100, 100)

%%timeit
d = (a - a[:, None])**2
np.sqrt(d.sum(-1)).round(2)

7.68 ms ± 75.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%%timeit
distance = np.zeros((100, 100))
for i, el1 in enumerate(a):
    for j, el2 in enumerate(a):
        distance[i][j] = np.sqrt(np.sum(np.square(np.subtract(el1, el2))))

51.1 ms ± 1.97 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
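
One thing to keep in mind: the broadcast version builds an intermediate array of shape (100, 100, 100) here, so memory grows with n * m * k. If that becomes a problem, a possible alternative (a sketch only, not from the original answer and not timed) is the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, which needs only (n, m) arrays. The name pairwise_euclidean_gram is made up for this example.

```python
import numpy as np

def pairwise_euclidean_gram(a, b):
    """Pairwise row distances using ||x||^2 + ||y||^2 - 2 * x.y."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    sq_a = (a ** 2).sum(axis=1)[:, None]   # squared row norms, shape (n, 1)
    sq_b = (b ** 2).sum(axis=1)[None, :]   # squared row norms, shape (1, m)
    sq_dist = sq_a + sq_b - 2.0 * (a @ b.T)
    # Rounding error can push tiny values slightly below zero; clip before sqrt.
    return np.sqrt(np.maximum(sq_dist, 0.0))

rng = np.random.default_rng(0)
a = rng.random((100, 100))

# Compare against the broadcasting approach on the same data.
reference = np.sqrt(((a[:, None, :] - a[None, :, :]) ** 2).sum(axis=-1))
result = pairwise_euclidean_gram(a, a)
print(np.allclose(result, reference, atol=1e-5))
```

This trades a little floating-point accuracy (the diagonal may come out as very small nonzero values instead of exact zeros) for lower memory use.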