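Code takes too much time

Time: 2013-12-26 05:22:34

Tags: python

I wrote code that arranges the numbers from 1 to n, where n is entered by the user, so that the sum of every pair of adjacent numbers is prime. The code works fine for inputs up to 10. Beyond that, the system hangs. Please let me know the steps to optimize it.

Example input: 8
The answer should be: (1,2,3,4,7,6,5,8)
The code follows: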

import itertools

x = raw_input("please enter a number")
range_x = range(int(x)+1)
del range_x[0]
result = list(itertools.permutations(range_x))
def prime(x):
    for i in xrange(1,x,2):
        if i == 1:
            i = i+1
        if x%i==0 and i < x :
            return False
    else:
        return True

def is_prime(a):
    for i in xrange(len(a)):
        print a
        if i < len(a)-1:
            if prime(a[i]+a[i+1]):
                pass
            else:
                return False
        else:
            return True


for i in xrange(len(result)):
    if i < len(result)-1:
        if is_prime(result[i]):
            print 'result is:'
            print result[i]
            break
    else:
        print 'result is'
        print result[i-1]

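4 answers:

Answer 0 (score: 4)

This answer is based on @Tim Peters' suggestion about Hamiltonian paths.

There are many possible solutions. To avoid excessive memory consumption for intermediate solutions, random paths can be generated. This also makes it easy to use multiple CPUs (each CPU generates its own paths in parallel).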

import multiprocessing as mp
import sys

def main():
    number = int(sys.argv[1])

    # directed graph, vertices: 1..number (including ends)
    # there is an edge between i and j if (i+j) is prime
    vertices = range(1, number+1)
    G = {} # vertex -> adjacent vertices
    is_prime = sieve_of_eratosthenes(2*number+1)
    for i in vertices:
        G[i] = []
        for j in vertices:
            if is_prime[i + j]:
                G[i].append(j) # there is an edge from i to j in the graph

    # utilize multiple cpus
    q = mp.Queue()
    for _ in range(mp.cpu_count()):
        p = mp.Process(target=hamiltonian_random, args=[G, q])
        p.daemon = True # do not survive the main process
        p.start()
    print(q.get())

if __name__=="__main__":
    main()

where the Sieve of Eratosthenes is:

def sieve_of_eratosthenes(limit):
    is_prime = [True]*limit
    is_prime[0] = is_prime[1] = False # zero and one are not primes
    for n in range(2, int(limit**.5) + 1): # every n with n*n < limit, e.g. n = 3 when limit is 11
        if is_prime[n]:
            for composite in range(n*n, limit, n):
                is_prime[composite] = False
    return is_prime

import random

def hamiltonian_random(graph, result_queue):
    """Build random paths until Hamiltonian path is found."""
    vertices = list(graph.keys())
    while True:
        # build random path
        path = [random.choice(vertices)] # start with a random vertex
        while True: # while the path can be extended with a random adjacent vertex
            neighbours = graph[path[-1]]
            random.shuffle(neighbours)
            for adjacent_vertex in neighbours:
                if adjacent_vertex not in path:
                    path.append(adjacent_vertex)
                    break
            else: # can't extend path
                break

        # check whether it is hamiltonian
        if len(path) == len(vertices):
            assert set(path) == set(vertices)
            result_queue.put(path) # found hamiltonian path
            return

Example

$ python order-adjacent-prime-sum.py 20

Output

[19, 18, 13, 10, 1, 4, 9, 14, 5, 6, 17, 2, 15, 16, 7, 12, 11, 8, 3, 20]

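The output is a random sequence that satisfies the conditions:

  • it is a permutation of the range from 1 to 20 (inclusive)
  • the sum of adjacent numbers is prime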
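
The two conditions above can be checked mechanically. The following is a minimal sketch, not part of the original answer; the helper names is_prime and is_valid_ordering are made up for illustration.

```python
def is_prime(k):
    # Trial division; adequate for the small sums that occur here.
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def is_valid_ordering(seq, n):
    # Condition 1: seq is a permutation of 1..n.
    # Condition 2: every adjacent pair sums to a prime.
    return (sorted(seq) == list(range(1, n + 1))
            and all(is_prime(a + b) for a, b in zip(seq, seq[1:])))

sample = [19, 18, 13, 10, 1, 4, 9, 14, 5, 6, 17, 2, 15, 16, 7, 12, 11, 8, 3, 20]
print(is_valid_ordering(sample, 20))                    # True
print(is_valid_ordering([1, 2, 3, 4, 5, 6, 7, 8], 8))   # False, since 4 + 5 = 9
```

A check like this is cheap compared with the search, so it can be run on every path a worker reports.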

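Time performance

On average it takes about 10 seconds to get a result for n = 900; extrapolating the time as an exponential function, n = 1000 should take about 20 seconds:

[Figure: time performance (no set solution)]

The image was generated using the following code: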

import numpy as np
figname = 'hamiltonian_random_noset-noseq-900-900'
Ns, Ts = np.loadtxt(figname+'.xy', unpack=True)

# use polyfit to fit the data
# y = c*a**n
# log y = log (c * a ** n)
# log Ts = log c + Ns * log a
coeffs = np.polyfit(Ns, np.log2(Ts), deg=1)
poly = np.poly1d(coeffs, variable='Ns')

# use curve_fit to fit the data
from scipy.optimize import curve_fit
def func(x, a, c):
    return c*a**x
popt, pcov = curve_fit(func, Ns, Ts)
aa, cc = popt
a, c = 2**coeffs

# plot it
import matplotlib.pyplot as plt
plt.figure()
plt.plot(Ns, np.log2(Ts), 'ko', label='time measurements')
plt.plot(Ns, np.polyval(poly, Ns), 'r-',
         label=r'$time = %.2g\times %.4g^N$' % (c, a))
plt.plot(Ns, np.log2(func(Ns, *popt)), 'b-',
         label=r'$time = %.2g\times %.4g^N$' % (cc, aa))
plt.xlabel('N')
plt.ylabel('log2(time in seconds)')
plt.legend(loc='upper left')
plt.show()

Fitted values:

>>> c*a**np.array([900, 1000])
array([ 11.37200806,  21.56029156])
>>> func([900, 1000], *popt)
array([ 14.1521409 ,  22.62916398])
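
As a rough reading of the log-linear fit, the two values of c*a**n printed above determine the growth factor a on their own. This is only arithmetic on the quoted numbers, under the assumption that the exponential model holds between n = 900 and n = 1000; the variable names are illustrative.

```python
import math

# Values of c*a**n at n = 900 and n = 1000, copied from the fit output above.
t900, t1000 = 11.37200806, 21.56029156

# t1000 / t900 = a**100, so a is the 100th root of the ratio.
a = (t1000 / t900) ** (1.0 / 100)

# Increase in n that doubles the running time under this model.
doubling = math.log(2) / math.log(a)

print(round(a, 4))      # 1.0064
print(round(doubling))  # 108
```

So under this fit the expected running time roughly doubles for every additional 108 or so in n.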

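Answer 1 (score: 4)

For posterity ;-), here's one more based on finding a Hamiltonian path. It's Python 3 code. As written, it stops at the first path found, but it can easily be changed to generate all paths. On my box, it finds a solution for every n from 1 through 900, in about one minute total. For n a bit larger than 900, it exceeds the maximum recursion depth.

The prime generator (psieve()) is vast overkill for this particular problem, but I had it handy and didn't feel like writing another ;-)

The path finder (ham()) is a recursive backtracking search, using what is often (but not always) a very effective ordering heuristic: of all the vertices adjacent to the last vertex in the path so far, look first at those with the fewest remaining exits. For example, this is "the usual" heuristic for solving Knight's Tour problems. In that context, it often finds a tour with no backtracking at all. Your problem appears to be a little tougher.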

def psieve():
    import itertools
    yield from (2, 3, 5, 7)
    D = {}
    ps = psieve()
    next(ps)
    p = next(ps)
    assert p == 3
    psq = p*p
    for i in itertools.count(9, 2):
        if i in D:      # composite
            step = D.pop(i)
        elif i < psq:   # prime
            yield i
            continue
        else:           # composite, = p*p
            assert i == psq
            step = 2*p
            p = next(ps)
            psq = p*p
        i += step
        while i in D:
            i += step
        D[i] = step

def build_graph(n):
    primes = set()
    for p in psieve():
        if p > 2*n:
            break
        else:
            primes.add(p)

    np1 = n+1
    adj = [set() for i in range(np1)]
    for i in range(1, np1):
        for j in range(i+1, np1):
            if i+j in primes:
                adj[i].add(j)
                adj[j].add(i)
    return set(range(1, np1)), adj

def ham(nodes, adj):
    class EarlyExit(Exception):
        pass

    def inner(index):
        if index == n:
            raise EarlyExit
        avail = adj[result[index-1]] if index else nodes
        for i in sorted(avail, key=lambda j: len(adj[j])):
            # Remove vertex i from the graph.  If this isolates
            # more than 1 vertex, no path is possible.
            result[index] = i
            nodes.remove(i)
            nisolated = 0
            for j in adj[i]:
                adj[j].remove(i)
                if not adj[j]:
                    nisolated += 1
                    if nisolated > 1:
                        break
            if nisolated < 2:
                inner(index + 1)
            nodes.add(i)
            for j in adj[i]:
                adj[j].add(i)

    n = len(nodes)
    result = [None] * n
    try:
        inner(0)
    except EarlyExit:
        return result

def solve(n):
    nodes, adj = build_graph(n)
    return ham(nodes, adj)
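
The ordering heuristic on its own can be shown in a few lines. The following is a simplified sketch, not the code above: it omits the isolated-vertex pruning, uses plain trial division instead of psieve(), and the names prime_sum_path, exits and extend are made up for illustration.

```python
def is_prime(k):
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def prime_sum_path(n):
    """Return one ordering of 1..n with prime adjacent sums, or None."""
    adj = {i: {j for j in range(1, n + 1) if j != i and is_prime(i + j)}
           for i in range(1, n + 1)}
    path, unused = [], set(adj)

    def exits(v):
        # Number of still-unvisited neighbours of v.
        return len(adj[v] & unused)

    def extend():
        if not unused:
            return True
        candidates = adj[path[-1]] & unused if path else set(unused)
        # Heuristic: try the candidate with the fewest remaining exits first.
        for v in sorted(candidates, key=exits):
            path.append(v)
            unused.remove(v)
            if extend():
                return True
            unused.add(v)   # backtrack
            path.pop()
        return False

    return path if extend() else None

print(prime_sum_path(8))
```

Because it recurses once per vertex, this sketch runs into the same recursion-depth limit as ham() for large n.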

Answer 2 (score: 3)

Dynamic programming to the rescue:

def is_prime(n):
    # 0 and 1 are not prime; without the guard, all() over an empty range returns True
    return n > 1 and all(n % i != 0 for i in range(2, n))

def order(numbers, current=[]):
    if not numbers:
        return current

    for i, n in enumerate(numbers):
        if current and not is_prime(n + current[-1]):
            continue

        result = order(numbers[:i] + numbers[i + 1:], current + [n])

        if result:
            return result

    return False

result = order(list(range(500)))  # list() so that slicing and + also work on Python 3

for i in range(len(result) - 1):
    assert is_prime(result[i] + result[i + 1])

You can force it to work for larger lists by increasing the maximum recursion depth.
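
A minimal sketch of raising the limit with sys.setrecursionlimit follows. The depth function and the value 5000 are arbitrary choices for illustration, not taken from the answer; setting the limit too high can crash the interpreter instead of raising an exception, so it should only be as large as the input needs.

```python
import sys

def depth(n):
    # Recurses n levels deep; stands in for a recursive search.
    return 0 if n == 0 else 1 + depth(n - 1)

old_limit = sys.getrecursionlimit()           # commonly 1000 by default
sys.setrecursionlimit(max(old_limit, 5000))   # allow deeper recursion

print(depth(3000))  # 3000
```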

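Answer 3 (score: 3)

Here's my take on a solution. As Tim Peters pointed out, this is a Hamiltonian path problem. So the first step is to generate the graph in some form.

Well, step zero in this case is to generate the primes. I'm going to use a sieve, but any primality test is fine. We need primes up to 2 * n, since that is the largest value any two numbers can sum to.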

m = 8
n = m + 1 # Just so I don't have to worry about zero indexes and random +/- 1's
primelen = 2 * m
prime = [True] * primelen
prime[0] = prime[1] = False
for i in range(4, primelen, 2):
    prime[i] = False
for i in range(3, primelen, 2):
    if not prime[i]:
        continue
    for j in range(i * i, primelen, i):
        prime[j] = False

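OK, now we can test primality with prime[i]. Now it is easy to build the graph edges: given a number i, which numbers can come next? I'll also use the fact that i and j must have opposite parity.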

pairs = [set(j for j in range(i%2+1, n, 2) if prime[i+j])
         for i in range(n)]

So here pairs[i] is a set object whose elements are the integers j such that i+j is prime.
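
To make the structure concrete, here is a small self-contained sketch that rebuilds pairs for m = 8 and prints a few entries. It substitutes a trial-division is_prime helper (an assumption made for brevity) for the prime[] sieve table.

```python
def is_prime(k):
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))

m = 8
n = m + 1
# For each i, only j of opposite parity can give an odd (hence possibly prime) sum.
pairs = [set(j for j in range(i % 2 + 1, n, 2) if is_prime(i + j))
         for i in range(n)]

print(sorted(pairs[1]))  # [2, 4, 6]
print(sorted(pairs[7]))  # [4, 6]
print(sorted(pairs[8]))  # [3, 5]
```

The relation is symmetric: j is in pairs[i] exactly when i is in pairs[j].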

Now we need to walk the graph. This is really the time-consuming part, and it is where all further optimizations will be done.

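chains = [
    ([], set(range(1, n)))
]

chains will keep track of the valid paths as we walk. The first element of each tuple is your result so far. The second element is the set of all unused numbers, i.e. the unvisited nodes. The idea is to take one chain out of the queue, take a step down the path, and put it back.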

while chains:
    chain, unused = chains.pop()

    if not chain:
        # we haven't even started, all unused are valid
        valid_next = unused
    else:
        # We need numbers that are both unused and paired with the last node
        # Using sets makes this easy
        valid_next = unused & pairs[chain[-1]]

    for num in valid_next:
        # Take a step to the new node and add the new path back to chains
        # Reminder: it's important not to mutate anything here, always make new objects
        newchain  = chain + [num]
        newunused = unused - set([num])
        chains.append( (newchain, newunused) )

        # are we done?
        if not newunused:
            print(newchain)
            chains = False

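Note that if there is no valid next step, the path is dropped without a replacement.

This is really memory-inefficient, but it runs in a reasonable time. The biggest performance bottleneck is walking the graph, so the next optimization would be to pop and insert paths at smart places in order to prioritize the most likely paths. In that case it may help to use a collections.deque or a different container for your chains.

Edit

Here is an example of how you can implement path priority. We'll assign each path a score and keep the chains list sorted by this score. As a simple example, I'll suggest that paths containing "hard to use" nodes are worth more. That is, for each step on a path the score increases by n - len(valid_next). The modified code will look something like this.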

import bisect
chains = ...
chains_score = [0]
while chains:
    chain, unused = chains.pop()
    score = chains_score.pop()
    ...

    for num in valid_next:
        newchain = chain + [num]
        newunused = unused - set([num])
        newscore = score + n - len(valid_next)
        index = bisect.bisect(chains_score, newscore)
        chains.insert(index, (newchain, newunused))
        chains_score.insert(index, newscore)

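Keep in mind that insertion is O(n), so the overhead of adding this may be considerable. It is worth doing some analysis of your scoring algorithm to keep the queue length len(chains) manageable.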
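
One way to avoid the O(n) list insertion is a binary heap, where push and pop are O(log n). The following is a sketch of that alternative, not the answer's code: it swaps the sorted list for heapq, the function name best_first_path is made up, and the score uses m rather than n = m + 1, which only shifts every step by a constant.

```python
import heapq
from itertools import count

def is_prime(k):
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def best_first_path(m):
    nums = range(1, m + 1)
    pairs = {i: frozenset(j for j in nums if j != i and is_prime(i + j))
             for i in nums}
    tie = count()  # tie-breaker so the heap never has to compare lists or sets
    # heapq is a min-heap, so scores are stored negated: highest score pops first.
    heap = [(0, next(tie), [], frozenset(nums))]
    while heap:
        neg_score, _, chain, unused = heapq.heappop(heap)
        if not unused:
            return chain
        valid_next = unused & pairs[chain[-1]] if chain else unused
        for num in valid_next:
            new_score = -neg_score + m - len(valid_next)
            heapq.heappush(heap, (-new_score, next(tie),
                                  chain + [num], unused - {num}))
    return None  # no ordering exists

print(best_first_path(8))
```

The heap still holds every partial path that has not been expanded yet, so the memory concern mentioned above applies here as well.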