对这些功能进行基准测试:
def divisors_optimized(number):
square_root = int(math.sqrt(number))
for divisor in range(1, square_root):
if number % divisor == 0:
yield divisor
yield number / divisor
if square_root ** 2 == number:
yield square_root
def number_of_divisors_optimized(number):
count = 0
square_root = int(math.sqrt(number))
for divisor in range(1, square_root):
if number % divisor == 0:
count += 2
if square_root ** 2 == number:
count += 1
return count
您可以看到两者的基本结构完全相同。
基准代码:
number = 9999999
for i in range(10):
print(f"iteration {i}:")
start = time.time()
result = list(utils.divisors_optimized(number))
end = time.time()
print(f'len(divisors_optimized) took {end - start} seconds and found {len(result)} divisors.')
start = time.time()
result = utils.number_of_divisors_optimized(number)
end = time.time()
print(f'number_of_divisors_optimized took {end - start} seconds and found {result} divisors.')
print()
输出:
iteration 0:
len(divisors_optimized) took 0.00019598007202148438 seconds and found 12 divisors.
number_of_divisors_optimized took 0.0001919269561767578 seconds and found 12 divisors.
iteration 1:
len(divisors_optimized) took 0.00019121170043945312 seconds and found 12 divisors.
number_of_divisors_optimized took 0.00020599365234375 seconds and found 12 divisors.
iteration 2:
len(divisors_optimized) took 0.000179290771484375 seconds and found 12 divisors.
number_of_divisors_optimized took 0.00019049644470214844 seconds and found 12 divisors.
iteration 3:
len(divisors_optimized) took 0.00019025802612304688 seconds and found 12 divisors.
number_of_divisors_optimized took 0.00020170211791992188 seconds and found 12 divisors.
iteration 4:
len(divisors_optimized) took 0.0001785755157470703 seconds and found 12 divisors.
number_of_divisors_optimized took 0.00017905235290527344 seconds and found 12 divisors.
iteration 5:
len(divisors_optimized) took 0.00022721290588378906 seconds and found 12 divisors.
number_of_divisors_optimized took 0.00020170211791992188 seconds and found 12 divisors.
iteration 6:
len(divisors_optimized) took 0.0001919269561767578 seconds and found 12 divisors.
number_of_divisors_optimized took 0.00018930435180664062 seconds and found 12 divisors.
iteration 7:
len(divisors_optimized) took 0.00017881393432617188 seconds and found 12 divisors.
number_of_divisors_optimized took 0.00017905235290527344 seconds and found 12 divisors.
iteration 8:
len(divisors_optimized) took 0.00017976760864257812 seconds and found 12 divisors.
number_of_divisors_optimized took 0.0001785755157470703 seconds and found 12 divisors.
iteration 9:
len(divisors_optimized) took 0.00024819374084472656 seconds and found 12 divisors.
number_of_divisors_optimized took 0.00020766258239746094 seconds and found 12 divisors.
你可以看到执行时间非常接近,每次都有利于。
有人可以向我解释如何从生成器中创建列表并检索其长度与迭代时的计数一样快吗?我的意思是,内存分配(list()
)不应该比分配昂贵得多吗?
我正在使用Python 3.6.3。
答案 0 :(得分:3)
你正在测试远比你正在制作的更多东西。 “找到因素”案例的int
与list
生成器操作的成本相比总计完成的工作量相形见绌。你正在执行超过3000个试验部门;十二个yield
s与十二个增加是对这种工作的笨重变化。用[{1}}替换添加/ yield
(无所事事),你会发现仍然在(大致)相同的时间内运行:
pass
使用def ignore_divisors_optimized(number):
square_root = int(math.sqrt(number))
for divisor in range(1, square_root):
if number % divisor == 0:
pass
if square_root ** 2 == number:
pass
的{{1}}魔法进行微基准测试:
ipython
%timeit
快一微秒的事实并不重要(重复测试的抖动高于一微秒);它们基本上都是相同的速度,因为> 99%的工作是循环和测试,而不是测试通过时所做的。
这是90/10优化规则的一个例子:大约90%的时间用在10%的代码中(在这种情况下,试验部门本身);在其他90%的代码中花费了10%。您正在优化90%的代码中的一小部分,这些代码在10%的时间内运行,并且没有帮助,因为绝大部分时间都花在>>> %timeit -r5 number_of_divisors_optimized(9999999)
266 µs ± 1.85 µs per loop (mean ± std. dev. of 5 runs, 1000 loops each)
>>> %timeit -r5 list(divisors_optimized(9999999))
267 µs ± 1.29 µs per loop (mean ± std. dev. of 5 runs, 1000 loops each)
>>> %timeit -r5 ignore_divisors_optimized(9999999)
267 µs ± 1.43 µs per loop (mean ± std. dev. of 5 runs, 1000 loops each)
行上。如果你删除那个测试,而只是循环遍历number_of_divisors
什么都不做,运行时在我的本地微基准测试中下降到~78μs,这意味着测试占用了近200μs的运行时间,超过了所有其余时间的两倍。放在一起的代码需要。
如果你想优化它,你需要看看加速试验分割线本身的方法(基本上意味着不同的Python解释器或使用Cython将其编译为C),或者运行该线的方法更少次数(例如,预计算可能的素因子达到某个界限,因此对于任何给定的输入,您可以避免测试非素数除数,然后根据已知素数因子及其多样性生成/计算非素因子的数量)。