How do I rearrange a list like this (Python)?

Asked: 2012-11-15 03:55:56

Tags: python algorithm list

For example, the list to_be contains 3 of "a", 4 of "b", 3 of "c", 5 of "d", ...

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d", ...]

Now I want it to look like this:

done = ["a", "b", "c", "d", ... , "a", "b", "c", "d", ... , "b", "d", ...] (Note: some items occur more often than others, but the output must still follow a predefined order, alphabetical for example.)

What is the fastest way to do this?

5 answers:

Answer 0 (score: 12)

Assuming I understand what you want, this can be done relatively easily by combining itertools.zip_longest, itertools.groupby, and itertools.chain.from_iterable():

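We first group the items (the "a"s, the "b"s, and so on), then zip the groups together so that the items come out in the order you want (one from each group per round), use chain to flatten the result into a single list, and finally remove the None values that the zipping introduced.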

>>> [item for item in itertools.chain.from_iterable(itertools.zip_longest(*[list(x) for _, x in itertools.groupby(to_be)])) if item]
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

You might want to split the list comprehensions apart to make this more readable, though:

>>> groups = itertools.zip_longest(*[list(x) for _, x in itertools.groupby(to_be)])
>>> [item for item in itertools.chain.from_iterable(groups) if item]
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

(The version given is for 3.x; for 2.x you will need izip_longest().)

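As always, if you expect falsy values such as empty strings or 0, you will want if item is not None instead, and if you need to keep None values, create a sentinel object and check for identity against it.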
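The sentinel approach mentioned above could look roughly like the following. This is a minimal sketch, assuming Python 3; the function name interleave_groups and the variable missing are made up for illustration and do not come from the answer.

```python
import itertools

def interleave_groups(items):
    # Hypothetical helper: a unique object that cannot collide with any
    # real value in the input, including None, 0, or "".
    missing = object()
    groups = [list(g) for _, g in itertools.groupby(items)]
    # Pad the shorter groups with the sentinel instead of the default None.
    rounds = itertools.zip_longest(*groups, fillvalue=missing)
    # Compare by identity so that falsy values and None are kept.
    return [x for x in itertools.chain.from_iterable(rounds) if x is not missing]

sample = [0, 0, None, None, None, "", ""]
result = interleave_groups(sample)
```

Because the filter tests identity against the sentinel rather than truthiness, the 0, None, and "" entries in sample all survive.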

You can also use the roundrobin() recipe given in the docs as an alternative to zipping, which makes it as simple as:

>>> list(roundrobin(*[list(x) for _, x in itertools.groupby(to_be)]))
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
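Note that roundrobin() is not part of itertools itself; it is a recipe from the itertools documentation that you have to copy into your code. The sketch below is a simplified function with the same behavior, not the documentation's exact text, and it assumes Python 3.

```python
import itertools

def roundrobin(*iterables):
    """roundrobin('ABC', 'D', 'EF') yields A D E B F C."""
    iterators = [iter(it) for it in iterables]
    while iterators:
        still_active = []
        for it in iterators:
            try:
                yield next(it)
            except StopIteration:
                # This iterator is exhausted; leave it out of the next round.
                continue
            still_active.append(it)
        iterators = still_active

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d"]
done = list(roundrobin(*[list(x) for _, x in itertools.groupby(to_be)]))
```

Unlike the zip_longest version, this never introduces fill values, so there is nothing to filter out afterwards.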

As a final note, the observant may notice that I build lists from the groupby() generators, which may seem wasteful. The reason comes from the docs:

  

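The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list.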
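The effect described in that passage can be seen with a small experiment. This is an illustrative sketch only; the names lazy and eager are made up here.

```python
import itertools

data = ["a", "a", "b"]

# Collect the group iterators first and read them only afterwards.
# By then groupby() has already advanced past the earlier groups.
stale_groups = [g for _, g in itertools.groupby(data)]
lazy = [list(g) for g in stale_groups]

# Materialize each group immediately, before groupby() advances.
eager = [list(g) for _, g in itertools.groupby(data)]
```

Here eager holds the full groups, while the first entry of lazy is empty because its data was already consumed when groupby() moved on.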

Answer 1 (score: 2)

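import collections

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d"]
counts = collections.Counter(to_be)
answer = []
while counts:
    # Emit one of each remaining item, in sorted order.
    answer.extend(sorted(counts))
    for k in counts:
        counts[k] -= 1
    # Drop items that are used up (items() works on both Python 2 and 3).
    counts = {k: v for k, v in counts.items() if v > 0}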

Now answer looks like this:

['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

Hope this helps.

Answer 2 (score: 1)

I'm not sure whether this is the fastest, but here is my stab at it:

>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> def sort_key(a):
...     d[a] += 1
...     return d[a],a
...

>>> sorted(to_be,key=sort_key)
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

Wrapped in a function:

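from collections import defaultdict

def weird_sort(x):
    d = defaultdict(int)  # how many times each element has been seen so far
    def sort_key(a):
        d[a] += 1
        # Sort by occurrence number first, then by the element itself.
        return (d[a], a)
    return sorted(x, key=sort_key)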

Of course, this requires that the elements in your iterable be hashable.
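To see why the stateful key works, it may help to compute the keys explicitly. The sketch below is an illustration, not part of the answer; the helper name occurrence_keys and the shorter sample list are made up.

```python
from collections import defaultdict

def occurrence_keys(items):
    # Hypothetical helper: pair each element with its occurrence number,
    # i.e. the same (count, element) tuples that sort_key returns.
    seen = defaultdict(int)
    keys = []
    for item in items:
        seen[item] += 1
        keys.append((seen[item], item))
    return keys

sample = ["a", "a", "b", "b", "b", "c"]
keys = occurrence_keys(sample)
# Sorting the tuples puts all first occurrences first, then all second ones, and so on.
rearranged = [item for _, item in sorted(keys)]
```

Every first occurrence gets a key starting with 1, every second occurrence a key starting with 2, so sorting the tuples yields one round per occurrence number, with elements in their natural order inside each round.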

Answer 3 (score: 0)

A somewhat different approach from Lattyware's:

import collections

def rearrange(l):
    counts = collections.Counter(l)
    output = []
    # Keep going while any item still has occurrences left.
    while sum(counts.values()) > 0:
        # Emit one of each remaining item, in sorted order.
        output.extend(sorted(k for k, v in counts.items() if v > 0))
        for k in counts:
            counts[k] = counts[k] - 1 if counts[k] > 0 else 0
    return output

Answer 4 (score: 0)

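Doing this "manually", in a state-machine style, should be more efficient, but for relatively small lists (< 5000 items) you should have no problems using Python's built-in goodies to do it: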

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d","e", "e"]


def do_it(lst):
    lst = lst[:]
    result = []

    while True:
        group = set(lst)
        result.extend(sorted(group))
        for element in group:
            del lst[lst.index(element)]
        if not lst:
            break
    return result

done = do_it(to_be)

The big-O complexity of the function above is probably quite high. I have not worked it out.