Efficiently computing pairwise equality for NumPy arrays

Time: 2018-01-08 20:18:28

Tags: python arrays numpy

Given two NumPy arrays, for example:

import numpy as np
import numpy.random as rand

n = 1000
x = rand.binomial(n=1, p=.5, size=(n, 10))
y = rand.binomial(n=1, p=.5, size=(n, 10))

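Is there a more efficient way to compute X than the following double loop, which sets X[i, j] to 1 when row i of x equals row j of y and to 0 otherwise?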

X = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        X[i, j] = 1 * np.all(x[i] == y[j])

1 answer:

Answer 0: (score: 2)

Approach #1: Input arrays of 0s & 1s

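For input arrays containing only 0s and 1s, we can reduce each row to a scalar by treating it as the binary digits of an integer. That reduces both inputs to 1D arrays, which we can then compare with broadcasting. Note that the powers of two must fit in the integer dtype, so with 64-bit integers this works for at most 63 columns. Like so -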

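n = x.shape[1]           # number of columns (note: reuses the name n)
s = 2**np.arange(n)      # bit weights 1, 2, 4, ...
x1D = x.dot(s)           # one integer per row of x
y1D = y.dot(s)           # one integer per row of y
Xout = (x1D[:,None] == y1D).astype(float)  # all-pairs comparison via broadcasting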
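As a sanity check, the following sketch compares Approach #1 against the double loop from the question on a smaller input. The sizes, the seed, and the names `X_loop`, `X_packed`, and `results_match` are illustrative choices, not part of the original answer.

```python
import numpy as np

rng = np.random.RandomState(0)
n_rows, n_cols = 50, 10  # kept small so the reference loop runs quickly
x = rng.binomial(n=1, p=0.5, size=(n_rows, n_cols))
y = rng.binomial(n=1, p=0.5, size=(n_rows, n_cols))

# Reference: the explicit double loop from the question
X_loop = np.zeros((n_rows, n_rows))
for i in range(n_rows):
    for j in range(n_rows):
        X_loop[i, j] = 1 * np.all(x[i] == y[j])

# Approach #1: pack each 0/1 row into a single integer, then broadcast
s = 2 ** np.arange(n_cols)
x1D = x.dot(s)
y1D = y.dot(s)
X_packed = (x1D[:, None] == y1D).astype(float)

# Worked example of the packing: [1, 0, 1] -> 1*1 + 0*2 + 1*4
packed_example = int(np.array([1, 0, 1]).dot(2 ** np.arange(3)))

results_match = np.array_equal(X_loop, X_packed)
print(packed_example, results_match)
```

Because distinct 0/1 rows map to distinct integers, comparing the packed values is equivalent to comparing whole rows.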

Approach #2: Generic case

For the generic case, we can use views, so that each row is treated as a single element -

# https://stackoverflow.com/a/45313353/ @Divakar
def view1D(a, b): # a, b are 2D arrays with the same dtype and number of columns
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    # One void element spans the bytes of a whole row
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(),  b.view(void_dt).ravel()

x1D, y1D = view1D(x, y)
Xout = (x1D[:,None] == y1D).astype(float)
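The sketch below tries the view-based approach on non-binary integer data and checks it against a direct 3D broadcasting comparison. The test data and the names `X_view`, `X_ref`, and `views_match` are assumptions made for illustration. `view1D` is repeated from above so the block runs on its own.

```python
import numpy as np

def view1D(a, b):
    # View each row of a and b as one opaque void element
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(), b.view(void_dt).ravel()

rng = np.random.RandomState(0)
# Values in {0, 1, 2}, so Approach #1's binary packing would not apply
x = rng.randint(0, 3, size=(40, 4))
y = rng.randint(0, 3, size=(30, 4))

x1D, y1D = view1D(x, y)
X_view = (x1D[:, None] == y1D).astype(float)

# Reference: compare every row pair element-wise, then reduce over columns
X_ref = (x[:, None, :] == y[None, :, :]).all(axis=-1).astype(float)

views_match = np.array_equal(X_view, X_ref)
print(X_view.shape, views_match)
```

The view compares raw bytes, so both arrays need the same dtype and column count. For floating-point data, byte equality can differ from value equality (for example, 0.0 and -0.0 have different bit patterns), so this sketch sticks to integers.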

Runtime test

# Setup
In [287]: np.random.seed(0)
     ...: n = 1000
     ...: x = rand.binomial(n=1, p=.5, size=(n, 10))
     ...: y = rand.binomial(n=1, p=.5, size=(n, 10))

# Original approach
In [288]: %%timeit
     ...: X = np.zeros((n, n))
     ...: for i in range(n):
     ...:     for j in range(n):
     ...:         X[i, j] = 1 * np.all(x[i] == y[j])
1 loop, best of 3: 4.69 s per loop

# Approach #1
In [290]: %%timeit
     ...: n = x.shape[1]
     ...: s = 2**np.arange(n)
     ...: x1D = x.dot(s)
     ...: y1D = y.dot(s)
     ...: Xout = (x1D[:,None] == y1D).astype(float)
1000 loops, best of 3: 1.42 ms per loop

# Approach #2
In [291]: %%timeit
     ...: x1D, y1D = view1D(x, y)
     ...: Xout = (x1D[:,None] == y1D).astype(float)
100 loops, best of 3: 18.5 ms per loop
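To put the timings above in relative terms, this small calculation derives the speedups from the reported figures. The variable names are illustrative, and the numbers are simply the ones printed by `%%timeit` above, so they will vary by machine.

```python
# Timings reported above, converted to seconds
loop_s = 4.69          # original double loop
approach1_s = 1.42e-3  # Approach #1 (binary packing)
approach2_s = 18.5e-3  # Approach #2 (void views)

speedup1 = loop_s / approach1_s
speedup2 = loop_s / approach2_s
ratio_2_to_1 = approach2_s / approach1_s

print(round(speedup1), round(speedup2), round(ratio_2_to_1, 1))
```

On these figures, Approach #1 is roughly 3300 times faster than the loop, Approach #2 roughly 250 times faster, and Approach #1 about 13 times faster than Approach #2.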