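I tried to further optimize the champion solution from the primes thread by replacing the complex formula for the sub-list length. Calling len() on the same subsequence is too slow, because both len and generating the subsequence are expensive. The change seems to speed up the function slightly, but I have not yet been able to get rid of the division, even though I only divide inside the conditional statement. Of course, I could try to simplify the length calculation by starting the marking at n rather than n * n...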
I replaced the division / with integer division // so that the code is compatible with Python 3 or with from __future__ import division.
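I would also be interested to know whether this recurrence formula could help speed up the numpy solution, but I do not have much experience with numpy.
If you enable psyco for the code, the story becomes completely different: the Atkin sieve code then becomes faster than this special slicing technique.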
import cProfile

try:
    xrange           # Python 2
except NameError:
    xrange = range   # Python 3 has no xrange

def rwh_primes1(n):
    # http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns a list of primes < n """
    sieve = [True] * (n//2)
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i//2]:
            sieve[i*i//2::i] = [False] * ((n-i*i-1)//(2*i)+1)
    return [2] + [2*i+1 for i in xrange(1,n//2) if sieve[i]]

def primes(n):
    # http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    # recurrence formula for length by amount1 and amount2 Tony Veijalainen 2010
    """ Returns a list of primes < n """
    sieve = [True] * (n//2)
    amount1 = n-10   # n - i*i - 1 for i = 3
    amount2 = 6      # 2*i for i = 3
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i//2]:
            ## can you make recurrence formula for whole reciprocal?
            sieve[i*i//2::i] = [False] * (amount1//amount2+1)
        # update on every iteration, whether or not i is prime
        amount1-=4*i+4
        amount2+=4
    return [2] + [2*i+1 for i in xrange(1,n//2) if sieve[i]]

numprimes=1000000
print('Profiling')
cProfile.Profile.bias = 4e-6
for test in (rwh_primes1, primes):
    cProfile.run("test(numprimes)")
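The recurrence in primes() can be sanity-checked against the closed-form length from rwh_primes1 and against len() of the actual slice. The following is only a sketch, written for Python 3; the helper names slice_lengths and recurrence_matches are made up for this illustration and are not part of the original code.

```python
def slice_lengths(n):
    """Yield (i, closed_form, recurrence, actual) for each odd i up to sqrt(n)."""
    sieve = [True] * (n // 2)
    amount1 = n - 10   # n - i*i - 1 at i = 3
    amount2 = 6        # 2*i at i = 3
    for i in range(3, int(n ** 0.5) + 1, 2):
        closed_form = (n - i * i - 1) // (2 * i) + 1   # formula in rwh_primes1
        recurrence = amount1 // amount2 + 1            # formula in primes
        actual = len(sieve[i * i // 2::i])             # the slow way
        yield i, closed_form, recurrence, actual
        amount1 -= 4 * i + 4   # (i + 2)**2 - i**2
        amount2 += 4           # 2*(i + 2) - 2*i


def recurrence_matches(n):
    """Return True if all three length computations agree for every i."""
    return all(c == r == a for _, c, r, a in slice_lengths(n))


if __name__ == "__main__":
    print(recurrence_matches(1000000))
```

The two decrements follow from stepping i by 2: the numerator n - i*i - 1 shrinks by 4*i + 4 and the denominator 2*i grows by 4, so no multiplication by i*i is needed inside the loop, although the division remains.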
Profiling (not much difference between the versions):
         3 function calls in 0.191 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.006    0.006    0.191    0.191 <string>:1(<module>)
        1    0.185    0.185    0.185    0.185 myprimes.py:3(rwh_primes1)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

         3 function calls in 0.192 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.006    0.006    0.192    0.192 <string>:1(<module>)
        1    0.186    0.186    0.186    0.186 myprimes.py:12(primes)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
Interestingly, after raising the limit to 10 ** 8, removing the profiling, and putting a timing decorator on the functions instead:
rwh_primes1 took 23.670 s
primes took 22.792 s
primesieve took 10.850 s
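The timing decorator itself is not shown in the post. A minimal sketch of one that prints lines in the same "name took x s" format could look like this, assuming Python 3 (time.perf_counter); the names timed and count_odds are hypothetical and only serve the illustration.

```python
import functools
import time


def timed(func):
    """Print how long each call to func takes, then return its result."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print('%s took %0.3f s' % (func.__name__, elapsed))
        return result
    return wrapper


@timed
def count_odds(n):
    # Hypothetical stand-in for one of the sieve functions.
    return sum(1 for i in range(n) if i % 2)


if __name__ == "__main__":
    count_odds(10 ** 6)
```

Wall-clock timing like this includes the cost of building the result list, which matters for the comparison that follows.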
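Also interesting: if you do not generate the list of primes but return the sieve itself, the run time is about half that of the version that builds the number list.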
Answer 0 (score: 1)
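You can apply a wheel optimization. Multiples of 2 and 3 are not prime, so do not store them at all. You can then start from 5 and skip the multiples of 2 and 3 by incrementing in steps of 2, 4, 2, 4, 2, 4, and so on.
Below is C++ code for it. Hope this helps.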
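#include <cmath>

// Assumes MAX (the sieve limit) and isComp (a zero-initialised array of at
// least MAX/3 + 1 flags) are defined elsewhere; they are not shown here.
void sieve23()
{
    int lim=sqrt(MAX);
    // i visits 5, 7, 11, 13, ...: the step alternates between 2 and 4
    for(int i=5,bit1=0;i<=lim;i+=(bit1?4:2),bit1^=1)
    {
        if(!isComp[i/3])
        {
            // j visits the multiples of i that are coprime to 6:
            // 5*i, 7*i, 11*i, 13*i, ...
            for(int j=i,bit2=1;;)
            {
                j+=(bit2?4*i:2*i);
                bit2=!bit2;
                if(j>=MAX)break;
                isComp[j/3]=1;   // j/3 is a unique index for numbers coprime to 6
            }
        }
    }
}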
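To make the 2, 4, 2, 4 stepping concrete, here is a small Python sketch of the candidate sequence that the outer loop of sieve23 walks over. The generator name wheel_candidates is hypothetical; only the stepping pattern and the i/3 indexing are taken from the answer.

```python
def wheel_candidates(limit):
    """Yield integers in [5, limit) that are divisible by neither 2 nor 3."""
    i, step = 5, 2
    while i < limit:
        yield i
        i += step
        step = 6 - step  # alternate between 2 and 4


if __name__ == "__main__":
    candidates = list(wheel_candidates(30))
    print(candidates)                    # numbers coprime to 6
    print([c // 3 for c in candidates])  # their slots in isComp
```

Because the candidates map to consecutive values of i // 3, the flag array needs only about a third of the entries of a plain sieve.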
Answer 1 (score: 0)
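In case you decide to go with C++ for more speed, I ported the Python sieve to C++. The full discussion can be found here: Porting optimized Sieve of Eratosthenes from Python to C++.
On an Intel Q6600 running Ubuntu 10.10, compiled with g++ -O3 and with N = 100000000, this takes 415 ms.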
#include <cstdint>
#include <vector>
#include <boost/dynamic_bitset.hpp>

// http://vault.embedded.com/98/9802fe2.htm - integer square root
// The algorithm consumes a 32-bit input two bits at a time, so fixed-width
// 32-bit types are used here; a 64-bit unsigned long would break `a >> 30`.
unsigned short isqrt(std::uint32_t a) {
    std::uint32_t rem = 0;
    std::uint32_t root = 0;

    for (short i = 0; i < 16; i++) {
        root <<= 1;
        rem = ((rem << 2) + (a >> 30));
        a <<= 2;
        root++;

        if (root <= rem) {
            rem -= root;
            root++;
        } else root--;
    }

    return static_cast<unsigned short> (root >> 1);
}

// https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
// https://stackoverflow.com/questions/5293238/porting-optimized-sieve-of-eratosthenes-from-python-to-c/5293492
template <class T>
void primesbelow(T N, std::vector<T> &primes) {
    T i, j, k, sievemax, sievemaxroot;

    // The sieve stores only numbers coprime to 6: index i stands for (3*i+1)|1.
    sievemax = N/3;
    if ((N % 6) == 2) sievemax++;

    sievemaxroot = isqrt(N)/3;

    boost::dynamic_bitset<> sieve(sievemax);
    sieve.set();
    sieve[0] = 0;   // index 0 stands for 1, which is not prime

    for (i = 0; i <= sievemaxroot; i++) {
        if (sieve[i]) {
            k = (3*i + 1) | 1;
            // clear both residue classes of multiples of k
            for (j = k*k/3; j < sievemax; j += 2*k) sieve[j] = 0;
            for (j = (k*k+4*k-2*k*(i&1))/3; j < sievemax; j += 2*k) sieve[j] = 0;
        }
    }

    primes.push_back(2);
    primes.push_back(3);

    for (i = 0; i < sievemax; i++) {
        if (sieve[i]) primes.push_back((3*i+1)|1);
    }
}
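The index arithmetic in primesbelow is compact, so a short Python sketch of the mapping may help. The function names index_to_value and value_to_index are made up for this illustration; the expressions (3*i+1)|1 and the division by 3 are the ones used in the C++ code above.

```python
def index_to_value(i):
    """Map a sieve index to the number it represents, as in (3*i+1)|1."""
    return (3 * i + 1) | 1


def value_to_index(v):
    """Map a number coprime to 6 back to its sieve index, as in k*k/3."""
    return v // 3


if __name__ == "__main__":
    values = [index_to_value(i) for i in range(8)]
    print(values)
    print([value_to_index(v) for v in values])
```

Index 0 maps to 1, which is why the C++ code clears sieve[0] before sieving; every later index maps to the next number that is divisible by neither 2 nor 3.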