MemoryError when trying to use itertools.permutations: how can I use less memory?

Time: 2015-07-08 20:59:01

Tags: python algorithm python-3.x permutation itertools

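I'm loading strings from a text document that contains random strings like the ones below, and I'm trying to print every possible permutation of the characters in each string.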

If the text file contains, for example:

123
abc

I want my output to be:

123,132,213,231,312,321
abc,acb,bac,bca,cab,cba

The text file contains some very large strings, so I can see why I'm getting this MemoryError.

My first attempt used this:

import sys
import itertools
import math

def organize(to_print):
    number_list = []
    upper_list = []
    lower_list = []
    for x in range(0,len(to_print)):
        if str(to_print[x]).isdigit() is True:
            number_list.append(to_print[x])
        elif to_print[x].isupper() is True:
            upper_list.append(to_print[x])
        else:
            lower_list.append(to_print[x])
    master_list = number_list + upper_list + lower_list
    return master_list

number = open(*file_dir*, 'r').readlines()

factorial = math.factorial(len(number))
complete_series = ''

for x in range(0,factorial):
    complete_string = ''.join((list(itertools.permutations(organize(number)))[x]))

    complete_series += complete_string+','
edit_series = complete_series[:-1]
print(edit_series)

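The reason for def organize is that if I have a string such as 1aB, I need to pre-sort it by digits, then uppercase, then lowercase before I start permuting.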

I get the memory error here: complete_string = ''.join((list(itertools.permutations(organize(number)))[x])) so my first idea was to move that call out of the for loop.

My second attempt was this:

import sys
import itertools
import math

def organize(to_print):
    number_list = []
    upper_list = []
    lower_list = []
    for x in range(0,len(to_print)):
        if str(to_print[x]).isdigit() is True:
            number_list.append(to_print[x])
        elif to_print[x].isupper() is True:
            upper_list.append(to_print[x])
        else:
            lower_list.append(to_print[x])
    master_list = number_list + upper_list + lower_list
    return master_list

number = open(*file_dir*, 'r').readlines()

factorial = math.factorial(len(number))
complete_series = ''
the_permutation = list(itertools.permutations(organize(number)))

for x in range(0,factorial):
    complete_string = ''.join((the_permutation[x]))

    complete_series += complete_string+','
edit_series = complete_series[:-1]
print(edit_series)

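But I still get a memory error. I don't necessarily need or want the answer handed to me directly, since this is a good opportunity to learn how to reduce my inefficiency, so a hint in the right direction would be nice.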

Added a third attempt:

import sys
import itertools
import math

def organize(to_print):
    number_list = []
    upper_list = []
    lower_list = []
    for x in range(0,len(to_print)):
        if str(to_print[x]).isdigit() is True:
            number_list.append(to_print[x])
        elif to_print[x].isupper() is True:
            upper_list.append(to_print[x])
        else:
            lower_list.append(to_print[x])
    master_list = number_list + upper_list + lower_list
    return master_list

number = open(*file_dir*, 'r').readlines()

factorial = math.factorial(len(number))
complete_series = ''
the_permutation = itertools.permutations(organize(number))
for x in itertools.islice(the_permutation,factorial):
    complete_string = ''.join(next(the_permutation))
    complete_series += complete_string+','
edit_series = complete_series[:-1]
print(edit_series)

2 answers:

Answer 0: (score: 3)

Don't call list; just iterate over the permutations:

the_permutation = itertools.permutations(organize(number))

for x in the_permutation:
    complete_string = ''.join(x)

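list(itertools.permutations(organize(number))) stores every permutation in memory, and your loop then stores all of them a second time in a string. Even when you iterate lazily, there is no guarantee that you can hold all of the output, depending on how much data the_permutation produces.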
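As a minimal sketch of the lazy approach, the block below writes each permutation as soon as it is produced instead of accumulating one large string. The helper names iter_permutation_strings and write_permutations are hypothetical, and the pre-sorting done by organize in the question is left out for brevity.

```python
import itertools
import sys


def iter_permutation_strings(chars):
    """Lazily yield each permutation of chars as a joined string."""
    for perm in itertools.permutations(chars):
        yield ''.join(perm)


def write_permutations(chars, out=sys.stdout):
    """Write comma-separated permutations to out one at a time.

    Only the current permutation is held in memory, never the full series.
    """
    first = True
    for text in iter_permutation_strings(chars):
        if not first:
            out.write(',')
        out.write(text)
        first = False
    out.write('\n')


# Example: the first line from the question's sample file.
write_permutations('123')
```

Memory use stays roughly constant this way, although the running time and the size of the output still grow with the factorial of the string length.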

If you only want a certain number of permutations, you can call next on the permutations object:

the_permutation = itertools.permutations(organize(number))
for x in range(factorial):
    complete_string = ''.join(next(the_permutation))

Or use itertools.islice:

for x in itertools.islice(the_permutation,factorial):
    complete_string = ''.join(x)
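For illustration, here is a small self-contained sketch of taking only the first few permutations with itertools.islice. The function name first_permutations and the limit parameter are assumptions made for this example. The loop variable already holds each permutation, so no extra next call is needed inside the loop.

```python
import itertools


def first_permutations(chars, limit):
    """Return at most `limit` permutations of chars as joined strings."""
    perms = itertools.permutations(chars)
    # islice stops after `limit` items without generating the rest.
    return [''.join(p) for p in itertools.islice(perms, limit)]


print(first_permutations('abcd', 3))  # ['abcd', 'abdc', 'acbd']
```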

Answer 1: (score: 0)

Keep in mind that factorials grow very quickly


...so even for a string of moderate length, the number of permutations is huge. For 12 letters it is about 480 million.
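To make the growth concrete, this short sketch prints the permutation count for a few string lengths, assuming every character in the string is distinct. The helper name permutation_count is hypothetical.

```python
import math


def permutation_count(length):
    """Number of permutations of `length` distinct characters."""
    return math.factorial(length)


for n in (3, 6, 9, 12, 15):
    print(n, permutation_count(n))
# 12 characters already give 479001600 permutations.
```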