Question

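It is universally agreed that a list of n distinct symbols has n! permutations. However, when the symbols are not distinct, the most common convention, in mathematics and elsewhere, seems to be to count only distinct permutations. Thus the permutations of the list [1, 1, 2] are usually considered to be [1, 1, 2], [1, 2, 1], [2, 1, 1]. Indeed, the following C++ code prints precisely those three: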

int a[] = {1, 1, 2};
do {
    cout<<a[0]<<" "<<a[1]<<" "<<a[2]<<endl;
} while(next_permutation(a,a+3));

On the other hand, Python's itertools.permutations seems to print something else:

import itertools
for a in itertools.permutations([1, 1, 2]):
    print(a)

This prints

(1, 1, 2)
(1, 2, 1)
(1, 1, 2)
(1, 2, 1)
(2, 1, 1)
(2, 1, 1)

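As user Artsiom Rudzenka points out in an answer, the Python documentation says so:

Elements are treated as unique based on their position, not on their value.

My question: why was this design decision made?

It seems that following the usual convention would give results that are more useful (and indeed it is usually exactly what I want)... or is there some application of Python's behaviour that I'm missing?

[Or is it some implementation issue? The algorithm as in next_permutation, for instance explained on StackOverflow here (by me) and shown here to be O(1) amortised, seems efficient and implementable in Python, but is Python doing something even more efficient since it doesn't guarantee lexicographic order based on value? And if so, was the increase in efficiency considered worth it?]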

Answer 1

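I can't speak for the designer of itertools.permutations (Raymond Hettinger), but it seems to me that there are a couple of points in favour of the design:

Firstly, if you used a next_permutation-style approach, then you'd be restricted to passing in objects that support a linear ordering, whereas itertools.permutations provides permutations of any kind of object. Imagine how annoying this would be: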

>>> list(itertools.permutations([1+2j, 1-2j, 2+1j, 2-1j]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers

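Secondly, by not testing for equality on objects, itertools.permutations avoids paying the cost of calling the __eq__ method in the usual case where it's not necessary.

Basically, itertools.permutations solves the common case reliably and cheaply. There's certainly an argument to be made that itertools ought to provide a function that avoids duplicate permutations, but such a function should be in addition to itertools.permutations, not instead of it. Why not write such a function and submit a patch?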
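As a small illustration of the first point, here is a sketch assuming Python 3, where complex numbers define no ordering. The names `values`, `perms` and `orderable` are only illustrative.

```python
from itertools import permutations

values = [1 + 2j, 1 - 2j, 2 + 1j, 2 - 1j]

# itertools.permutations never compares its elements, so this just works.
perms = list(permutations(values))

# Anything that relies on ordering (as a next_permutation-style
# algorithm would) fails for complex numbers in Python 3.
try:
    sorted(values)
    orderable = True
except TypeError:
    orderable = False

print(len(perms), orderable)
```

An algorithm that depends on `<` would have to reject input like this, or ask the caller for a key function.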

Answer 2

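I'm accepting the answer of Gareth Rees as the most appealing explanation (short of an answer from the Python library designers), namely, that Python's itertools.permutations doesn't compare the values of the elements. Come to think of it, this is what the question asks about, but I see now how it could be seen as an advantage, depending on what one typically uses itertools.permutations for.

Just for completeness, I compared three methods of generating all distinct permutations. Method 1, which is very inefficient memory-wise and time-wise but requires the least new code, is to wrap Python's itertools.permutations, as in zeekay's answer. Method 2 is a generator-based version of C++'s next_permutation, from this blog post. Method 3 is something I wrote that is even closer to C++'s next_permutation algorithm; it modifies the list in-place (I haven't made it too general).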

def next_permutationS(l):
    n = len(l)
    #Step 1: Find tail
    last = n-1 #tail is from `last` to end
    while last>0:
        if l[last-1] < l[last]: break
        last -= 1
    #Step 2: Increase the number just before tail
    if last>0:
        small = l[last-1]
        big = n-1
        while l[big] <= small: big -= 1
        l[last-1], l[big] = l[big], small
    #Step 3: Reverse tail
    i = last
    j = n-1
    while i < j:
        l[i], l[j] = l[j], l[i]
        i += 1
        j -= 1
    return last>0
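A minimal sketch of how an in-place step like this can drive a generator of distinct permutations, assuming the elements are orderable and the list is sorted first. The step function is repeated here in compact form only so the snippet runs on its own; the names `next_permutation` and `distinct_permutations` are illustrative, not part of any library.

```python
def next_permutation(l):
    """Advance l to its next lexicographic permutation, in place.

    Returns False (after resetting l to sorted order) when l was
    already the last permutation.
    """
    # Find the start of the longest non-increasing tail.
    i = len(l) - 1
    while i > 0 and l[i - 1] >= l[i]:
        i -= 1
    if i > 0:
        # Swap the element before the tail with the rightmost larger one.
        j = len(l) - 1
        while l[j] <= l[i - 1]:
            j -= 1
        l[i - 1], l[j] = l[j], l[i - 1]
    # Reverse the tail so it becomes non-decreasing.
    l[i:] = reversed(l[i:])
    return i > 0


def distinct_permutations(iterable):
    """Yield each distinct permutation once, in lexicographic order."""
    l = sorted(iterable)
    while True:
        yield tuple(l)
        if not next_permutation(l):
            return


for p in distinct_permutations([2, 1, 1]):
    print(p)
```

Sorting up front is what lets the loop see every distinct arrangement exactly once; starting from an unsorted list would skip the permutations that come before it lexicographically.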

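Here are some results. I have even more respect for Python's built-in function now: it's about three to four times as fast as the other methods when the elements are all (or almost all) distinct. Of course, when there are many repeated elements, using it is a terrible idea.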

Some results ("us" means microseconds):

l                                       m_itertoolsp  m_nextperm_b  m_nextperm_s
[1, 1, 2]                               5.98 us       12.3 us       7.54 us
[1, 2, 3, 4, 5, 6]                      0.63 ms       2.69 ms       1.77 ms
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]         6.93 s        13.68 s       8.75 s

[1, 2, 3, 4, 6, 6, 6]                   3.12 ms       3.34 ms       2.19 ms
[1, 2, 2, 2, 2, 3, 3, 3, 3, 3]          2400 ms       5.87 ms       3.63 ms
[1, 1, 1, 1, 1, 1, 1, 1, 1, 2]          2320000 us    89.9 us       51.5 us
[1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4]    429000 ms     361 ms        228 ms
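
The gap in the rows with many repeats follows from simple counting: itertools.permutations always produces n! tuples, while the number of distinct permutations is n! divided by the factorial of each value's multiplicity. A rough sketch of that arithmetic (the helper name `permutation_counts` is made up for this example):

```python
from collections import Counter
from math import factorial


def permutation_counts(items):
    """Return (tuples generated by itertools.permutations, distinct permutations)."""
    total = factorial(len(items))
    distinct = total
    # Divide by m! for every value that occurs m times.
    for multiplicity in Counter(items).values():
        distinct //= factorial(multiplicity)
    return total, distinct


for row in ([1, 1, 2],
            [1, 1, 1, 1, 1, 1, 1, 1, 1, 2],
            [1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4]):
    print(row, permutation_counts(row))
```

For the nine-ones-and-a-two row, for instance, the wrapper has to generate and filter 3628800 tuples to keep only 10 of them.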

The code is here if anyone wants to explore.

Answer 3

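It's fairly easy to get the behaviour you prefer by wrapping itertools.permutations, which might have influenced the decision. As described in the documentation, itertools is designed as a collection of building blocks/tools to use in building your own iterators.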

from itertools import permutations

def unique(iterable):
    # Remember every item already yielded; items must be hashable.
    seen = set()
    for x in iterable:
        if x in seen:
            continue
        seen.add(x)
        yield x

for a in unique(permutations([1, 1, 2])):
    print(a)

(1, 1, 2)
(1, 2, 1)
(2, 1, 1)

However, as pointed out in the comments, this might not be quite as efficient as you'd like:

>>> %timeit iterate(permutations([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2]))
1 loops, best of 3: 4.27 s per loop

>>> %timeit iterate(unique(permutations([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2])))
1 loops, best of 3: 13.2 s per loop

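If there is enough interest, a new function or an optional argument to itertools.permutations could be added to itertools to generate permutations without duplicates more efficiently.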

Answer 4

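I also find it surprising that itertools doesn't have a function for the more intuitive concept of unique permutations. Generating repeated permutations only to select the unique ones among them is out of the question for any serious application.

I have written my own generator function which behaves similarly to itertools.permutations but does not return duplicates. Only permutations of the whole original list are considered; sublists can be created with the standard itertools library.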

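def unique_permutations(t):
    lt = list(t)
    lnt = len(lt)
    if lnt == 1:
        # Base case: a single element has exactly one permutation.
        yield lt
        return
    # Choose each distinct value once as the first element (values must be
    # hashable), then permute the remaining elements recursively.
    st = set(t)
    for d in st:
        lt.remove(d)
        for perm in unique_permutations(lt):
            yield [d]+perm
        lt.append(d)  # put the value back; the order inside lt does not matter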
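
As a possible variant (a sketch, not the function above), the same idea can be written on top of collections.Counter, which avoids the repeated list.remove calls and makes it easy to accept an optional length `r` the way itertools.permutations does. The name `unique_permutations_counted` and the `r` parameter are assumptions made for this example, and elements must be hashable.

```python
from collections import Counter


def unique_permutations_counted(iterable, r=None):
    """Yield distinct r-length permutations as tuples (hypothetical helper)."""
    counts = Counter(iterable)
    n = sum(counts.values())
    r = n if r is None else r

    def extend(prefix):
        if len(prefix) == r:
            yield tuple(prefix)
            return
        # Try each distinct value that still has copies left.
        for value in counts:
            if counts[value] > 0:
                counts[value] -= 1
                prefix.append(value)
                yield from extend(prefix)
                prefix.pop()
                counts[value] += 1

    return extend([])


print(sorted(unique_permutations_counted([1, 1, 2])))
print(sorted(unique_permutations_counted([1, 1, 2], 2)))
```

Because each distinct value is tried at most once per position, no duplicate is ever generated, so nothing has to be filtered out afterwards.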

Answer 5

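Maybe I am wrong, but it seems that the reason for this is in 'Elements are treated as unique based on their position, not on their value. So if the input elements are unique, there will be no repeat values in each permutation.' You have specified (1,1,2), and from your point of view the 1 at index 0 and the 1 at index 1 are the same, but this is not so, since the Python permutations implementation uses indexes instead of values.

So if we take a look at the pure-Python equivalent of permutations shown in the documentation, we will see that it uses indexes: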

from itertools import product

def permutations(iterable, r=None):
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    # Enumerate index tuples and keep only those with no repeated index.
    for indices in product(range(n), repeat=r):
        if len(set(indices)) == r:
            yield tuple(pool[i] for i in indices)

For example, if you change your input to [1,2,3] you will get the expected permutations ([(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)]) since the values are unique.
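
One way to see the index-based behaviour directly is to permute the positions and look the values up afterwards. This is only a sketch of the idea; the variable names are illustrative.

```python
from itertools import permutations

pool = (1, 1, 2)

# Permute the positions 0, 1, 2 and then map each position to its value.
via_indices = [tuple(pool[i] for i in idx)
               for idx in permutations(range(len(pool)))]

# The built-in yields the same tuples in the same order.
direct = list(permutations(pool))

print(via_indices == direct)
print(len(direct), len(set(direct)))
```

The six index tuples are all different, but positions 0 and 1 hold equal values, so the value tuples collapse to three distinct results.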

Answer 6

Revisiting this old question, the easiest thing to do now is to use more_itertools.distinct_permutations.

Why does Python's itertools.permutations contain duplicates? (When the original list has duplicates)