Fastest way to count the elements of a sequence that satisfy a predicate

Date: 2018-02-10 10:12:19

Tags: python performance generator list-comprehension

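I want to count the elements of a one-off sequence that satisfy some property. I was a bit surprised that the generator expression is not the fastest: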

from random import random
l = [random() for i in range(1000000)]
%timeit len([None for x in l if x < 0.5])
%timeit len([x for x in l if x < 0.5])
%timeit sum(1 for x in l if x < 0.5)
%timeit sum(x < 0.5 for x in l)

Measured performance, in the same order as the statements above:

90.7 ms ± 7.59 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
97.7 ms ± 7.23 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
105 ms ± 3.66 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
178 ms ± 2.38 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Is there a faster way?

1 answer:

Answer 0 (score: 2)

If the conversion to a NumPy array itself is not counted in the timing, you can use NumPy:

import numpy as np
a = np.array(l)
%timeit np.sum(a < 0.5)

1.28 ms ± 48.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
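
As a self-contained illustration of the vectorized count (a minimal sketch, not part of the original benchmark), the comparison `a < 0.5` produces a boolean array and summing it counts the `True` entries. `np.count_nonzero` is an alternative spelling of the same count; no timing is claimed for it here. The seed and the names `mask`, `count_sum`, `count_nonzero` and `count_python` are only illustrative assumptions.

```python
import numpy as np

# Seeded generator so the run is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)
a = rng.random(1_000_000)

# Element-wise comparison yields a boolean array of the same length.
mask = a < 0.5

# Summing booleans counts the True entries (True == 1, False == 0).
count_sum = int(np.sum(mask))

# count_nonzero counts the same entries without treating them as numbers.
count_nonzero = int(np.count_nonzero(mask))

# Pure Python reference on the same data, for a correctness check.
count_python = sum(1 for x in a.tolist() if x < 0.5)

print(count_sum, count_nonzero, count_python)
```

All three counts should agree; only the time taken differs.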

Even with the conversion included in the timing, it is still significantly faster:

%%timeit 
a = np.array(l)
np.sum(a < 0.5)

27.2 ms ± 433 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

For comparison, here are the pure Python versions on my machine:

from random import random
l = [random() for i in range(1000000)]
%timeit len([None for x in l if x < 0.5])
%timeit len([x for x in l if x < 0.5])
%timeit sum(1 for x in l if x < 0.5)
%timeit sum(x < 0.5 for x in l)

46.4 ms ± 941 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
48.1 ms ± 1.25 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
59.5 ms ± 811 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
103 ms ± 1.49 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
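
The `%timeit` lines above only work inside IPython or Jupyter. To reproduce the comparison as a plain script, something like the following sketch with the standard `timeit` module could be used. The function names (`count_len_listcomp`, `count_sum_genexp`, `count_numpy`, `best_time_ms`) and the `repeat`/`number` values are my own assumptions, not part of the original measurements, and the numbers you get will depend on your machine.

```python
import timeit
from random import random

import numpy as np


def count_len_listcomp(values, threshold=0.5):
    # Build a throwaway list and take its length.
    return len([None for x in values if x < threshold])


def count_sum_genexp(values, threshold=0.5):
    # Sum a 1 for every matching element.
    return sum(1 for x in values if x < threshold)


def count_numpy(values, threshold=0.5):
    # Includes the list -> array conversion in the measured work.
    return int(np.sum(np.array(values) < threshold))


def best_time_ms(func, values, repeat=3, number=3):
    # Best average time per call, in milliseconds, over `repeat` rounds.
    timings = timeit.repeat(lambda: func(values), repeat=repeat, number=number)
    return min(timings) / number * 1000


if __name__ == "__main__":
    data = [random() for _ in range(1_000_000)]
    for func in (count_len_listcomp, count_sum_genexp, count_numpy):
        print(f"{func.__name__}: {best_time_ms(func, data):.1f} ms per call")
```

Taking the minimum over several rounds is a common way to reduce noise from other processes; `%timeit` reports mean and standard deviation instead, so the figures are not directly comparable.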