Question

I have this script to compute all the permutations of each string in a given file:

import sys
from itertools import permutations


def find_permutations(word):
    perms = [''.join(p) for p in permutations(word)]
    return ','.join(perms)


if __name__ == '__main__':
    with open(sys.argv[1]) as data:
        for line in data.readlines():
            print(find_permutations(line.rstrip()))

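But I run into a MemoryError when the file contains the string American Pie. Is there a way to get all the permutations of this string without hitting a memory error? Or are there really so many that the system can't handle it?

Exact error: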

Traceback (most recent call last):
  File "string_perms.py", line 13, in <module>
    print(find_permutations(line.rstrip()))
  File "string_perms.py", line 7, in find_permutations
    return ','.join(perms)
MemoryError

Answer 1

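I think one tip for this problem is to avoid reading the whole file into memory at once, since you only need one line at a time:

with open(sys.argv[1]) as data:
    for line in data:
        for words in find_permutations(line.rstrip()):
            print(words)

Then you can skip creating the perms list in the function entirely and return a generator instead:

def find_permutations(word):
    return (''.join(p) for p in permutations(word))

On the other hand, I'm not sure it makes sense to include the whitespace in your permutations; if it doesn't, you can replace all the spaces with an empty string.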
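As a rough sketch of the two suggestions in this answer combined, the generator below drops spaces before permuting. Using str.replace for this is an assumption about what the answer had in mind, and the sample input 'a b c' is only for illustration.

```python
from itertools import islice, permutations


def find_permutations(word):
    # Assumption: spaces carry no meaning, so remove them before permuting.
    letters = word.replace(' ', '')
    # Generator expression: permutations are produced one at a time.
    return (''.join(p) for p in permutations(letters))


# Peek at the first few results without materializing the rest.
first_three = list(islice(find_permutations('a b c'), 3))
print(first_three)  # ['abc', 'acb', 'bac']
```

Dropping the space from American Pie leaves 11 characters, which cuts the number of permutations from 12! to 11!, still far too many to hold in a list but fine to iterate lazily.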

Answer 2

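Or are there really so many that the system can't handle it?

Yep.

The string will have math.factorial(len(word)) permutations. Once those permutations are converted to strings and joined with ',' into one bigger string, the result is a string math.factorial(len(word)) * (len(word)+1) - 1 characters long. Even if each character takes only one byte, for the 12-character string 'American Pie' that comes to 6227020799 bytes (6+ gigs). On top of that, you start with a list of the permutations, so double that estimate for the minimum memory allocated during the function call.

The real question is: what do you need to do with each of the perms? If you only need to use one at a time, take advantage of the fact that permutations is a lazy generator to begin with.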
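The size estimate above can be checked with a few lines of arithmetic. This is only a sketch; joined_length is a hypothetical helper name, not something from the original script.

```python
import math


def joined_length(word):
    """Length of all permutations of word joined with single commas."""
    n = len(word)
    # n! permutations of n characters each, plus (n! - 1) comma separators.
    return math.factorial(n) * (n + 1) - 1


word = 'American Pie'
print(math.factorial(len(word)))  # 479001600
print(joined_length(word))        # 6227020799
```

For a tiny input the formula is easy to confirm by hand: 'ab' gives 'ab,ba', which is 5 characters, matching 2! * 3 - 1.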

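import sys
from itertools import permutations

with open(sys.argv[1]) as data:
    for line in data:
        # Strip the trailing newline so it is not permuted with the letters
        for perm in permutations(line.rstrip()):
            permstr = ''.join(perm)
            print(permstr, end=',')
            # do what you need with perm here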

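If you need random access to the permutations, you'll have to consider something more involved, such as computing only some of the permutations in chunks and saving the data to disk for later access.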
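One possible shape for the chunked approach mentioned above, as a minimal sketch: itertools.islice pulls a fixed number of permutations at a time, and each chunk is written to its own file. The function names, the perms_NNNNNN.txt naming scheme, and the one-permutation-per-line layout are all assumptions made for illustration.

```python
from itertools import islice, permutations
from pathlib import Path


def iter_permutation_chunks(word, chunk_size):
    """Yield lists of at most chunk_size permutation strings, in itertools order."""
    perms = (''.join(p) for p in permutations(word))
    while True:
        chunk = list(islice(perms, chunk_size))
        if not chunk:
            return
        yield chunk


def write_permutation_chunks(word, chunk_size, out_dir):
    """Write each chunk to its own file (hypothetical naming) and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, chunk in enumerate(iter_permutation_chunks(word, chunk_size)):
        path = out_dir / f"perms_{index:06d}.txt"
        path.write_text('\n'.join(chunk) + '\n')
        paths.append(path)
    return paths
```

Only one chunk is held in memory at a time, so chunk_size bounds peak memory use. The total disk space needed is still on the order of the full output, so for a 12-character string this remains several gigabytes.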

Memory error with string permutations
