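I want to count the elements of a one-off sequence that satisfy some property. I was a bit surprised that the generator expression is not the fastest option: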
from random import random
l = [random() for i in range(1000000)]
%timeit len([None for x in l if x < 0.5])
%timeit len([x for x in l if x < 0.5])
%timeit sum(1 for x in l if x < 0.5)
%timeit sum(x < 0.5 for x in l)
Measured timings:
90.7 ms ± 7.59 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
97.7 ms ± 7.23 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
105 ms ± 3.66 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
178 ms ± 2.38 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Is there a faster way?
Answer 0 (score: 2)
If the conversion to a NumPy array is not itself counted in the timing, you can use NumPy:
import numpy as np
a = np.array(l)
%timeit np.sum(a < 0.5)
1.28 ms ± 48.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
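As a side note, the same count can also be written with np.count_nonzero, and a one-off iterable can be turned into an array with np.fromiter. The following is only a minimal sketch: the helper name count_below, the seed, and the threshold default are assumptions made for illustration and are not part of the original answer, and no timing is claimed for it.

```python
import numpy as np


def count_below(values, threshold=0.5):
    """Count items of an iterable that are strictly below `threshold`.

    `count_below` is a hypothetical helper, not taken from the answer.
    np.fromiter consumes the iterable once, so it also works for generators.
    """
    arr = np.fromiter(values, dtype=float)
    return int(np.count_nonzero(arr < threshold))


# Seeded generator so that the example is reproducible (an assumption).
rng = np.random.default_rng(0)
a = rng.random(1_000_000)

mask = a < 0.5
count_sum = int(np.sum(mask))                # the form used in the answer
count_nonzero = int(np.count_nonzero(mask))  # equivalent count of True entries

print(count_sum, count_nonzero)
print(count_below(x for x in [0.1, 0.7, 0.3, 0.5]))
```

Both expressions count the True entries of the boolean mask, so they should always agree; which one is faster may depend on the NumPy version and is worth measuring on your own machine.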
Even when the conversion is included in the timing, it is still significantly faster:
%%timeit
a = np.array(l)
np.sum(a < 0.5)
27.2 ms ± 433 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
For comparison, here are the pure-Python versions on my machine:
from random import random
l = [random() for i in range(1000000)]
%timeit len([None for x in l if x < 0.5])
%timeit len([x for x in l if x < 0.5])
%timeit sum(1 for x in l if x < 0.5)
%timeit sum(x < 0.5 for x in l)
46.4 ms ± 941 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
48.1 ms ± 1.25 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
59.5 ms ± 811 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
103 ms ± 1.49 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
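To reproduce the pure-Python comparison outside IPython, the standard timeit module can stand in for the %timeit magic. This is a rough sketch under assumptions: the seed, the smaller list size, the repeat count, and the names data, candidates and counts are chosen here for illustration. It also checks that all four expressions return the same count before timing them.

```python
from random import random, seed
from timeit import timeit

seed(0)  # fixed seed for reproducibility (an assumption)
data = [random() for _ in range(100_000)]  # smaller than the original list

# The four pure-Python variants from the question, wrapped as callables.
candidates = {
    "len([None for x in data if x < 0.5])": lambda: len([None for x in data if x < 0.5]),
    "len([x for x in data if x < 0.5])": lambda: len([x for x in data if x < 0.5]),
    "sum(1 for x in data if x < 0.5)": lambda: sum(1 for x in data if x < 0.5),
    "sum(x < 0.5 for x in data)": lambda: sum(x < 0.5 for x in data),
}

# All variants must agree on the count before their speed is compared.
counts = {name: fn() for name, fn in candidates.items()}
assert len(set(counts.values())) == 1

loops = 10
for name, fn in candidates.items():
    seconds = timeit(fn, number=loops)
    print(f"{name:40s} {seconds / loops * 1e3:8.2f} ms per loop")
```

One plausible reason the last variant is slowest is that it yields a value for every element and sum has to add each boolean, whereas the filtered forms only produce a value for matching elements. Absolute numbers will differ from machine to machine.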