Question

I wrote a function that returns a list of all possible "products" of a string with repetition (for a spell-checking program):

from itertools import product

def productRepeats(string):
    comboList = []

    for item in [p for p in product(string, repeat = len(list(string)))]:
        if item not in comboList:
            comboList.append("".join(item))

    return list(set(comboList))

So when I run print(productRepeats("PIP")), the output is (I don't care about the order):

['PII', 'IIP', 'PPI', 'IPI', 'IPP', 'PPP', 'PIP', 'III']

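However, if I try a string longer than 5 characters (for example PIIIIP), the output takes about 30 seconds, even though there are only 64 possible results.

Is there any way to speed this up? For example, getting the list for the string 'GERAWUHGP' takes more than half an hour.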

Answer 1

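Eliminate the duplicates before calling `product()`

product(seq, repeat=len(seq)) produces duplicate results only if seq contains repeated elements. For example, product('ABC', repeat=3) has no duplicates, but product('ABA', repeat=3) has some, because A is selected more than once (and this is compounded because 'ABA' is drawn from three times, once per position). If you first filter all duplicates out of string and then pass the result to product, you can drop the post-product duplicate check entirely and simply return the result of product directly: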

from itertools import product

def productRepeats(string):
    return product(set(string), repeat=len(string))
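
To make the point about duplicates concrete, here is a minimal sketch that counts the results for 'ABA' with and without de-duplicating the input first. The helper name unique_products is hypothetical, and the sorted() call is only an assumption made here to get a stable order; neither appears in the answer above. Note that the function above returns an iterator of tuples, so this sketch joins each tuple into a string to match the output format in the question.

```python
from itertools import product

def unique_products(string):
    # Hypothetical helper: de-duplicate the characters first, then join
    # each tuple from product() into a string.
    # sorted() is an assumption added here purely for a stable order.
    chars = sorted(set(string))
    return ["".join(t) for t in product(chars, repeat=len(string))]

# 'ABA' contains 'A' twice, so product() yields the same string repeatedly.
with_repeats = ["".join(t) for t in product("ABA", repeat=3)]
deduplicated = unique_products("ABA")

print(len(with_repeats))       # 27 strings generated (3 ** 3)
print(len(set(with_repeats)))  # 8 distinct strings among them
print(len(deduplicated))       # 8 strings generated (2 ** 3), all distinct
```

Both approaches end up with the same set of strings; the second simply never generates the redundant ones.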

Answer 2

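There are a few tricks you can use:

Use a list comprehension or map for the iteration.
As @jwodder explains, use set(string) to avoid checking for duplicates at a later stage.

Here is a demo. For "hello" I see a speedup of roughly 900x: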

from itertools import product

def productRepeats(string):
    comboList = []

    for item in [p for p in product(string, repeat = len(list(string)))]:
        if item not in comboList:
            comboList.append("".join(item))

    return list(set(comboList))

def productRepeats2(string):
    return list(map(''.join, product(set(string), repeat=len(string))))

assert set(productRepeats2('hello')) == set(productRepeats('hello'))

%timeit productRepeats('hello')   # 127 ms
%timeit productRepeats2('hello')  # 143 µs
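
The %timeit lines above are IPython magics and will not run in a plain Python script. As a rough equivalent, the sketch below uses the standard library's timeit module. The snake_case function names and the repeat count of 5 are choices made for this sketch only, and the timings will differ from the figures quoted above depending on the machine.

```python
from itertools import product
from timeit import timeit

def product_repeats(string):
    # The question's approach: a linear membership scan for every tuple.
    combo_list = []
    for item in product(string, repeat=len(string)):
        if item not in combo_list:
            combo_list.append("".join(item))
    return list(set(combo_list))

def product_repeats2(string):
    # De-duplicate the input first, then join each tuple into a string.
    return list(map("".join, product(set(string), repeat=len(string))))

# Both versions must agree before their timings are worth comparing.
assert set(product_repeats("hello")) == set(product_repeats2("hello"))

runs = 5  # arbitrary repeat count chosen for this sketch
slow = timeit(lambda: product_repeats("hello"), number=runs) / runs
fast = timeit(lambda: product_repeats2("hello"), number=runs) / runs
print(f"original:  {slow * 1e3:.2f} ms per call")
print(f"set-based: {fast * 1e6:.2f} µs per call")
```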
