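I have a vector `v` that consists mostly of zeros, with only a few non-zero entries (most of them 1). I also have a matrix `m` that is square, symmetric, and integer-valued.
The following script generates similar data and times the part I am interested in: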
#!/usr/bin/env python
import time
import numpy as np
np.random.seed(0)
n = 20000
m = 20  # number of non-zero entries in v; the name m is reused for the matrix below
assert n >= m
# Create the vector
v = [1] * m + [0] * (n - m)
np.random.shuffle(v)
v = np.array(v, np.int8)
# Create the matrix
m = np.random.randint(0, 256, size=(n, n), dtype=np.uint8)
m = (m + m.T) / 2
m = m.astype(np.uint8)
# Multiplication
t0 = time.time()
result = np.dot(v, m)
t1 = time.time()
# Check the results
print("result shape: {}".format(result.shape))
print("result[0]: {}".format(result[0])) # should be 1757
print('Time: {:0.2f}s'.format(t1 - t0))
I tested several variations of the script above:
| Variation | Time |
| ------------------------------------------------- | ------ |
| Original | 21.65s |
| (1) m = m.astype(np.float32) | 0.09s |
| (2) v = v.astype(np.uint8) | 4.87s |
| (3) v = v.astype(np.int16);m = m.astype(np.int16) | 5.77s |
| (4) = (3) + matmul instead of dot | 6.91s |
| (5) = (1) + matmul instead of dot | 0.09s |
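To make the rows of the table concrete, here is a minimal sketch of how each variation could be timed on a much smaller matrix. The helper `time_dot`, the names `k` and `variations`, and the reduced size `n = 2000` are assumptions made for illustration and are not part of the original script, so the absolute timings will not match the table.

```python
import time
import numpy as np

def time_dot(v, m, func=np.dot):
    """Hypothetical helper: return (result, elapsed seconds) for func(v, m)."""
    t0 = time.perf_counter()
    result = func(v, m)
    return result, time.perf_counter() - t0

rng = np.random.RandomState(0)
n, k = 2000, 20  # assumed smaller size so the sketch runs quickly
v = np.zeros(n, dtype=np.int8)
v[rng.choice(n, size=k, replace=False)] = 1  # k ones, the rest zeros
m = rng.randint(0, 256, size=(n, n)).astype(np.uint8)

# One entry per row of the table: (vector, matrix, multiplication function)
variations = {
    "Original": (v, m, np.dot),
    "(1) float32 m": (v, m.astype(np.float32), np.dot),
    "(2) uint8 v": (v.astype(np.uint8), m, np.dot),
    "(3) int16 v and m": (v.astype(np.int16), m.astype(np.int16), np.dot),
    "(4) (3) with matmul": (v.astype(np.int16), m.astype(np.int16), np.matmul),
    "(5) (1) with matmul": (v, m.astype(np.float32), np.matmul),
}

results = {}
for name, (vv, mm, func) in variations.items():
    results[name], elapsed = time_dot(vv, mm, func)
    print("{:<22} {:0.4f}s  dtype={}".format(name, elapsed, results[name].dtype))
```

One caveat when comparing the outputs: with both operands as `uint8`, variation (2) also returns `uint8`, so any sum above 255 wraps around. It is included here only for its timing.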
Looking at my test results, I have two questions:
(I tried `v = scipy.sparse.csr_matrix(v)`, but did not get the same result.)
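Because `v` has only a few non-zero entries, one possible way to exploit that sparsity without SciPy is to multiply only the rows of `m` that `v` actually selects. The sketch below is an illustration under that assumption, not something the original script does. The function name `sparse_vec_dot` is hypothetical, and the data is deliberately small so the result can be checked against a plain `np.dot`.

```python
import numpy as np

def sparse_vec_dot(v, m):
    """Compute v @ m using only the rows of m where v is non-zero."""
    idx = np.flatnonzero(v)              # positions of the non-zero entries
    # Accumulate in int64 so that small integer dtypes cannot overflow.
    weights = v[idx].astype(np.int64)
    rows = m[idx].astype(np.int64)       # shape (len(idx), n)
    return weights @ rows                # shape (n,)

rng = np.random.RandomState(0)
n, k = 1000, 20  # assumed small sizes for a quick check
v = np.zeros(n, dtype=np.int8)
v[rng.choice(n, size=k, replace=False)] = 1
m = rng.randint(0, 256, size=(n, n)).astype(np.uint8)

result = sparse_vec_dot(v, m)
expected = np.dot(v.astype(np.int64), m.astype(np.int64))
print(np.array_equal(result, expected))
```

Only `len(idx)` rows are converted and multiplied, so the work should scale with the number of non-zero entries rather than with the full length of `v`. Whether this beats the `float32` variation on the full-size data would need to be measured.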