Python list comprehension: collecting duplicate columns

Time: 2013-09-02 05:40:57

Tags: python

I have a sorted list of lists in which the first element of the sublists repeats. At the moment I iterate over it to build the result.

[['5th ave', 111, -30.00, 38.00],
['5th ave', 222, -30.00, 33.00],
['6th ave', 2224, -32.00, 34.90]]

I would like an elegant list comprehension that converts it into a list of lists grouped by the first element:

['5th ave', [[111, -30.00, 38.00], [222, -30.00, 33.00]]]

Thanks

3 answers:

Answer 0 (score: 8)

Looks like a job for collections.defaultdict:

>>> from collections import defaultdict
>>> L = [['5th ave', 111, -30.00, 38.00],
... ['5th ave', 222, -30.00, 33.00],
... ['6th ave', 2224, -32.00, 34.90]]
>>> d = defaultdict(list)
>>> for sublist in L:
...     d[sublist[0]].append(sublist[1:])
... 
>>> print(list(d.items()))
[('5th ave', [[111, -30.0, 38.0], [222, -30.0, 33.0]]), ('6th ave', [[2224, -32.0, 34.9]])]

There is absolutely no reason to force this into a list comprehension. Having fewer lines does not make code more pythonic.

Answer 1 (score: 1)

data = [['5th ave', 111, -30.00, 38.00],
['5th ave', 222, -30.00, 33.00],
['6th ave', 2224, -32.00, 34.90]]

previous   = ""
listOfData = []
result     = []
for currentItem in data:
    if currentItem[0] != previous:
        if listOfData:
            result.append([previous, listOfData])
            listOfData = []
        previous = currentItem[0]
    listOfData.append(currentItem[1:])

if listOfData:
    result.append([previous, listOfData])

print(result)

Output

[['5th ave', [[111, -30.0, 38.0], [222, -30.0, 33.0]]], ['6th ave', [[2224, -32.0, 34.9]]]]

This also preserves the order.
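The hand-written loop above depends on rows with equal keys being adjacent, which is also the condition itertools.groupby works under. As a minimal sketch, assuming the input is already sorted by its first element as the question states (the helper name group_rows is made up for illustration), the same grouping can be written as a comprehension:

```python
from itertools import groupby
from operator import itemgetter

data = [['5th ave', 111, -30.00, 38.00],
        ['5th ave', 222, -30.00, 33.00],
        ['6th ave', 2224, -32.00, 34.90]]


def group_rows(rows):
    """Group consecutive rows that share the same first element.

    Hypothetical helper: it assumes rows with equal keys are adjacent,
    so unsorted input would yield the same key more than once.
    """
    return [[key, [row[1:] for row in group]]
            for key, group in groupby(rows, key=itemgetter(0))]


result = group_rows(data)
print(result)
```

For the sample data this should print the same nested list as the loop version. If the input were not sorted, it would need to be sorted by the first element before grouping.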

Edit

Using defaultdict, I can cut a few lines:

from collections import defaultdict

data = [['5th ave', 111, -30.00, 38.00],
['5th ave', 222, -30.00, 33.00],
['6th ave', 2224, -32.00, 34.90]]

unique, Map = [], defaultdict(list)
for item in data:
    # Record each key the first time it appears so the order is kept.
    if item[0] not in unique:
        unique.append(item[0])
    Map[item[0]].append(item[1:])
print([(item, Map[item]) for item in unique])

This still preserves the order.
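On Python 3.7 and later, dicts (and defaultdict, which is a dict subclass) are guaranteed to keep insertion order, so the separate list of seen keys is arguably no longer needed there. A minimal sketch under that version assumption; group_in_order is an illustrative name, not something from the answer:

```python
from collections import defaultdict

data = [['5th ave', 111, -30.00, 38.00],
        ['5th ave', 222, -30.00, 33.00],
        ['6th ave', 2224, -32.00, 34.90]]


def group_in_order(rows):
    """Group rows by first element, keeping keys in first-seen order.

    Hypothetical helper; relies on dict insertion order (Python 3.7+).
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[0]].append(row[1:])
    # items() yields keys in the order they were first inserted.
    return list(grouped.items())


pairs = group_in_order(data)
print(pairs)
```

Unlike a groupby-style approach, this does not require the input to be sorted: rows with the same key are merged wherever they appear.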

Answer 2 (score: 1)

collections.defaultdict really is the way to go, but I suspect it may be a little slower, which is why I came up with this:

def RemDup(L):
    # Group the tail of each sublist under its first element.
    ListComp = {}
    for sublist in L:
        try:
            ListComp[sublist[0]].append(sublist[1:])
        except KeyError:
            ListComp[sublist[0]] = [sublist[1:]]
    # Python 3's built-in map is lazy, as itertools.imap was in Python 2.
    return map(list, ListComp.items())

DupList = [['5th ave', 111, -30.00, 38.00],
['5th ave', 222, -30.00, 33.00],
['6th ave', 2224, -32.00, 34.90]]

print([uniq for uniq in RemDup(DupList)])
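The speed concern is stated as a hunch rather than a measurement, so it may be worth timing both variants before choosing between them. A rough sketch using timeit from the standard library; the function names, the repetition factor of 1000, and the run count are arbitrary choices for illustration, and results will vary with the interpreter and the data:

```python
import timeit
from collections import defaultdict

# The sample rows repeated to get a measurable workload (arbitrary size).
rows = [['5th ave', 111, -30.00, 38.00],
        ['5th ave', 222, -30.00, 33.00],
        ['6th ave', 2224, -32.00, 34.90]] * 1000


def with_defaultdict(rows):
    """Grouping with collections.defaultdict."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[0]].append(row[1:])
    return dict(grouped)


def with_try_except(rows):
    """Grouping with a plain dict and try/except KeyError."""
    grouped = {}
    for row in rows:
        try:
            grouped[row[0]].append(row[1:])
        except KeyError:
            grouped[row[0]] = [row[1:]]
    return grouped


# Sanity check: both variants should build the same mapping.
same_result = with_defaultdict(rows) == with_try_except(rows)

for func in (with_defaultdict, with_try_except):
    seconds = timeit.timeit(lambda: func(rows), number=20)
    print(f"{func.__name__}: {seconds:.4f} s for 20 runs")
```

With only a handful of distinct keys the KeyError branch is rarely taken, so any difference is likely to be small; data with many distinct keys may behave differently.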