Question

For example, the list to_be contains 3 of "a", 4 of "b", 3 of "c", 5 of "d"...

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d", ...]

Now I want it to look like this:

done = ["a", "b", "c", "d", ... , "a", "b", "c", "d", ... , "b", "d", ...] (notice: some items occur more often than others, but they still need to be in a pre-defined order, alphabetically for example)

What is the fastest way to do this?

Answer 1

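Assuming I understand what you want, this can be done relatively easily by combining itertools.zip_longest, itertools.groupby and itertools.chain.from_iterable():

We first group the items (the "a"s, the "b"s, and so on), zip the groups together to get them in the order you want (one from each group), use chain to produce a single list, and then remove the None values introduced by the zipping.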

>>> [item for item in itertools.chain.from_iterable(itertools.zip_longest(*[list(x) for _, x in itertools.groupby(to_be)])) if item]
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

You might want to split out some of the list comprehensions to make it more readable, though:

>>> groups = itertools.zip_longest(*[list(x) for _, x in itertools.groupby(to_be)])
>>> [item for item in itertools.chain.from_iterable(groups) if item]
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

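(The version given is for 3.x; for 2.x, you will want izip_longest().)

As always, if you expect empty strings, 0, and so on, you will want to use if item is not None instead, and if you need to keep None values, create a sentinel object and check for identity against that.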
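The sentinel idea can be sketched as follows. This is a minimal illustration, not part of the original answer: the names interleave_groups and _MISSING are made up here, and the sample data is chosen only to show that falsy values and None survive.

```python
import itertools

# Hypothetical marker object: nothing in the data can be identical to it.
_MISSING = object()

def interleave_groups(items):
    """Take one item from each run of equal items in turn, keeping falsy values."""
    groups = [list(g) for _, g in itertools.groupby(items)]
    # Pad the shorter groups with the sentinel instead of the default None.
    zipped = itertools.zip_longest(*groups, fillvalue=_MISSING)
    # An identity check removes only the padding, never real data.
    return [item for item in itertools.chain.from_iterable(zipped)
            if item is not _MISSING]

result = interleave_groups([0, 0, "", "", "", None])
print(result)
```

Because the filter compares identity against a private object, 0, "" and None in the input are all preserved.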

You could also use the roundrobin() recipe given in the documentation as an alternative to zipping, which makes it as simple as:

>>> list(roundrobin(*[list(x) for _, x in itertools.groupby(to_be)]))
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
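Note that roundrobin() is not importable from itertools; it has to be copied into your code. The version below follows the recipe as it has appeared in the itertools documentation for Python 3, reproduced from memory, so the wording in your version of the docs may differ slightly.

```python
from itertools import cycle, groupby, islice

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    num_active = len(iterables)
    # Cycle over the bound __next__ methods of each iterator.
    nexts = cycle(iter(it).__next__ for it in iterables)
    while num_active:
        try:
            for nxt in nexts:
                yield nxt()
        except StopIteration:
            # Drop the iterator that was just exhausted from the cycle.
            num_active -= 1
            nexts = cycle(islice(nexts, num_active))

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d"]
done = list(roundrobin(*[list(x) for _, x in groupby(to_be)]))
print(done)
```

Unlike the zip_longest version, this needs no fill value, so there is nothing to filter out afterwards.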

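As a final note, the observant might notice that I make lists from the groupby() generators, which may seem wasteful. The reason comes from the docs:

The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list.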
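A small sketch of what that means in practice (the variable names are illustrative only). The exact contents of the stale groups can vary between Python versions, so no output is shown for them; the point is only that data is lost.

```python
import itertools

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c"]

# Keep the group iterators without copying them. By the time we read them,
# groupby() has already advanced past their data.
stale_groups = [g for _, g in itertools.groupby(to_be)]
stale_contents = [list(g) for g in stale_groups]

# Copy each group into a list while it is still the current group.
saved_contents = [list(g) for _, g in itertools.groupby(to_be)]

print(saved_contents)
print(sum(len(g) for g in stale_contents) < len(to_be))
```

The second line printed is True: the stale iterators no longer yield all of the original items, which is why each group is turned into a list immediately.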

Answer 2

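import collections

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d"]
counts = collections.Counter(to_be)
answer = []
while counts:
    # one pass: every remaining item once, in sorted order
    answer.extend(sorted(counts))
    for k in counts:
        counts[k] -= 1
    # drop exhausted items (use counts.iteritems() on Python 2)
    counts = {k: v for k, v in counts.items() if v > 0}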

Now, answer looks like this:

['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

Hope this helps.

Answer 3

I'm not sure if this is the fastest, but here's my stab at it:

>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> def sort_key(a):
...     d[a] += 1
...     return d[a],a
...

>>> sorted(to_be,key=sort_key)
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

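Wrapped in a function:

from collections import defaultdict

def weird_sort(x):
    d = defaultdict(int)  # how many times each element has been seen so far
    def sort_key(a):
        d[a] += 1
        # sort by occurrence number first, then by the element itself
        return (d[a], a)
    return sorted(x, key=sort_key)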

Of course, this requires the elements of your iterable to be hashable.

Answer 4

A little more elegant than Lattyware's:

import collections

def rearrange(l):
    counts = collections.Counter(l)
    output = []
    # keep going while any item still has occurrences left
    while sum([v for k, v in counts.items()]) > 0:
        output.extend(sorted([k for k, v in counts.items() if v > 0]))
        for k in counts:
            counts[k] = counts[k] - 1 if counts[k] > 0 else 0
    return output  # return the rearranged list, not the exhausted counter

Answer 5

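Doing this "manually and state-machine style" should be more efficient, but for relatively small lists (<5000) you should have no problem using Python's niceties to do it: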

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d","e", "e"]


def do_it(lst):
    lst = lst[:]
    result = []

    while True:
        group = set(lst)
        result.extend(sorted(group))
        for element in group:
            del lst[lst.index(element)]
        if not lst:
            break
    return result

done = do_it(to_be)

The "big O" complexity of the above function should be really big. I have not tried to work it out.

How do I rearrange a list like this (Python)?
