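I noticed a case in Python where a block of code nested in a loop runs much faster when it executes back to back than when it runs with a .sleep() interval between iterations.
I would like to know the reason and a possible solution.
My guess is that it is related to the CPU cache or to some mechanism of the CPython VM.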
'''
Created on Aug 22, 2015
@author: doge
'''
import numpy as np
import time
import gc
gc.disable()  # rule out garbage collection pauses
t = np.arange(100000)
for i in range(100):
    #np.sum(t)
    time.sleep(1) #--> if you comment this line, the following lines will be much faster
    st = time.time()
    np.sum(t)
    # elapsed time of np.sum(t) in microseconds
    print((time.time() - st)*1e6)
Result:
without sleep in loop, time consumed: 50us
with a sleep in loop, time consumed: >150us
One drawback of .sleep() is that it releases the CPU, so here is a version that delays in exactly the same way as the C code further down:
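'''
Created on Aug 22, 2015
@author: doge
'''
import numpy as np
import time
import gc
gc.disable()  # rule out garbage collection pauses
t = np.arange(100000)
count = 0
# NOTE: as posted, 100 iterations never reach count % 1000000 == 0, so nothing
# is timed; the loop bound has to be at least 1000000 for the delay to take effect.
for i in range(100):
    count += 1
    if ( count % 1000000 != 0 ):
        continue
    #--> these three lines make the following lines much slower
    st = time.time()
    np.sum(t)
    # elapsed time of np.sum(t) in microseconds
    print((time.time() - st)*1e6)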
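The counter-based delay above lasts as long as the interpreter happens to need for the spin, which varies between machines. A minimal sketch of a time-based busy wait, which keeps the core occupied for a fixed interval instead of releasing it the way time.sleep() does, might look like the following. The helper names busy_wait and timed_sum_us are hypothetical, time.perf_counter() assumes Python 3.3 or later, and the built-in sum stands in for np.sum so that the sketch has no dependencies.

```python
import time

def busy_wait(duration_sec):
    """Spin until duration_sec has elapsed, without voluntarily yielding the CPU."""
    deadline = time.perf_counter() + duration_sec
    while time.perf_counter() < deadline:
        pass

def timed_sum_us(values):
    """Return how long one sum(values) call takes, in microseconds."""
    start = time.perf_counter()
    sum(values)
    return (time.perf_counter() - start) * 1e6

data = list(range(100000))

for _ in range(5):
    busy_wait(0.05)    # delay that keeps the core busy
    print("after busy wait: %.1f us" % timed_sum_us(data))

for _ in range(5):
    time.sleep(0.05)   # delay that releases the CPU
    print("after sleep:     %.1f us" % timed_sum_us(data))
```

If the two groups of timings differ, the delay itself is probably not the cause and releasing the CPU is a more likely suspect; if they match, the idle gap matters regardless of how it is produced.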
Another experiment (with the for loop removed):
st = time.time()
np.sum(t)
print((time.time() - st)*1e6)
st = time.time()
np.sum(t)
print((time.time() - st)*1e6)
st = time.time()
np.sum(t)
print((time.time() - st)*1e6)
...
st = time.time()
np.sum(t)
print((time.time() - st)*1e6)
Result:
Execution time gradually decreased from 150us to 50us,
and then stayed stable at 50us.
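The gradual decrease is easier to inspect when the individual timings are collected in a list. The sketch below is a convenience, not an identical experiment: it reintroduces a loop, the helper name time_repeated_calls is hypothetical, time.perf_counter() assumes Python 3.3 or later, and the built-in sum replaces np.sum to keep the example dependency-free.

```python
import time

def time_repeated_calls(work, n_calls):
    """Call work() n_calls times back to back; return each duration in microseconds."""
    durations_us = []
    for _ in range(n_calls):
        start = time.perf_counter()
        work()
        durations_us.append((time.perf_counter() - start) * 1e6)
    return durations_us

data = list(range(100000))
durations = time_repeated_calls(lambda: sum(data), n_calls=20)

# Print one line per call so a warm-up trend, if any, is visible at a glance.
for index, duration in enumerate(durations):
    print("call %2d: %.1f us" % (index, duration))
```

Passing a different callable as work, for example lambda: np.sum(t), would time the original NumPy call in the same way.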
To determine whether this is a CPU cache issue, I wrote a C counterpart and found that the phenomenon does not occur there.
#include <iostream>
#include <sys/time.h>
#define num 100000
using namespace std;
long gus()
{
    struct timeval tm;
    gettimeofday(&tm, NULL);
    return ( (tm.tv_sec % 86400 + 28800) % 86400 )*1000000 + tm.tv_usec;
}
double vec_sum(double *v, int n){
    double result = 0;
    for(int i = 0;i < n;++i){
        result += v[i];
    }
    return result;
}
int main(){
    double a[num];
    for(int i = 0; i < num; ++i){
        a[i] = (double)i;
    }
    //for(int i = 0; i < 1000; ++i){
    //    cout<<a[i]<<"\n";
    //}
    int count = 0;
    long st;
    while(1){
        ++count;
        if(count%100000000 != 0){ //---> i use this line to create a delay, we can do the same way in python, result is the same
        //if(count%1 != 0){
            continue;
        }
        st = gus();
        vec_sum(a,num);
        cout<<gus() - st<<endl;
    }
    return 0;
}
Result:
The time is stable at 250us, with either "count%100000000" or "count%1".
Answer 0 (score: 1)
(Not an answer, but too long to post as a comment.)
I did some experiments and ran a slightly simpler version through timeit.
from timeit import timeit
import time
n_loop = 15
n_timeit = 10
sleep_sec = 0.1
t = range(100000)
def with_sleep():
    for i in range(n_loop):
        s = sum(t)
        time.sleep(sleep_sec)
def without_sleep():
    for i in range(n_loop):
        s = sum(t)
def sleep_only():
    for i in range(n_loop):
        time.sleep(sleep_sec)
wo = timeit(setup='from __main__ import without_sleep',
            stmt='without_sleep()',
            number = n_timeit)
w = timeit(setup='from __main__ import with_sleep',
           stmt='with_sleep()',
           number = n_timeit)
so = timeit(setup='from __main__ import sleep_only',
            stmt='sleep_only()',
            number = n_timeit)
print(so - n_timeit*n_loop*sleep_sec, so)
print(w - n_timeit*n_loop*sleep_sec, w)
print(wo)
The results are:
0.031275457000447204 15.031275457000447
1.0220358229998965 16.022035822999896
0.41462676399987686
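The first line just checks that the sleep function uses about n_timeit*n_loop*sleep_sec seconds; if the first value on that line is small, that should be fine.
But as you can see, your finding persists: the loop with the sleep function (after subtracting the time spent sleeping) takes more time than the loop without sleep...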
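To put the totals above on a per-call basis, the arithmetic can be spelled out as a short sketch. The three totals are copied from the output reported above; the variable names are only illustrative.

```python
# Figures copied from the timeit output reported above.
n_loop = 15
n_timeit = 10
sleep_sec = 0.1

sleep_only_total = 15.031275457000447
with_sleep_total = 16.022035822999896
without_sleep_total = 0.41462676399987686

n_calls = n_timeit * n_loop            # total number of sum(t) calls per variant
nominal_sleep = n_calls * sleep_sec    # time the sleeps should take in total

# Average cost of one sum(t) call in each variant, in seconds.
per_call_with_sleep = (with_sleep_total - nominal_sleep) / n_calls
per_call_without_sleep = without_sleep_total / n_calls
# Average amount by which one sleep call overshoots its nominal duration.
sleep_overshoot_per_call = (sleep_only_total - nominal_sleep) / n_calls

print("sum(t) per call, with sleep:    %.2f ms" % (per_call_with_sleep * 1e3))
print("sum(t) per call, without sleep: %.2f ms" % (per_call_without_sleep * 1e3))
print("sleep overshoot per call:       %.2f ms" % (sleep_overshoot_per_call * 1e3))
print("slowdown factor:                %.2f" % (per_call_with_sleep / per_call_without_sleep))
```

With these figures, one sum(t) call costs roughly 6.8 ms when it follows a sleep and roughly 2.8 ms otherwise, a factor of about 2.5. The sleep overshoot of about 0.2 ms per call is too small to account for the gap.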
I don't think Python optimizes the loop without sleep (a C compiler might, since the variable s is never used).