Grouping and sorting a list using Python

Time: 2014-07-10 01:24:08

Tags: python list group-by

I am trying to sort one list using another list while keeping the two in sync:

keys = [x,x,x,y,y,x,x,z,z,z,x,x]
data = [1,2,3,4,5,6,7,8,9,10,11,12]

I want to use the keys list to organize the data list into subgroups that share the same key.

result = [[1,2,3,6,7,11,12],[4,5,],[8,9,10]]

I also want to make sure the values are sorted within each subgroup.

So far, I have been able to get everything sorted correctly:

group = []

data = sorted(zip(data, keys), key=lambda x: (x[1]))
for i, grp in groupby(data, lambda x: x[1]):
    sub_group = [], []
    for j in grp:
        sub_group.append(j[1])
    group.extend(sub_group)

What am I still missing? Thanks!

4 answers:

Answer 0: (score: 2)

You are almost there. Try this code:

group = []
data = sorted(zip(data, keys), key=lambda x: (x[1]))
for i, grp in groupby(data, lambda x: x[1]):
    group.append([item[0] for item in grp])

Each item in grp is a (data, key) pair, so you need to pick the data out of each pair with [item[0] for item in grp].
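
To see why the item[0] selection is needed, here is a small sketch of what groupby yields for sorted (data, key) pairs. The string keys and the shortened lists are assumptions made purely for illustration; they are not from the original post.

```python
from itertools import groupby

# Hypothetical, shortened inputs (string keys chosen for illustration).
keys = ['x', 'x', 'y', 'x', 'z']
data = [1, 2, 3, 4, 5]

# Pair each value with its key, then sort by key so equal keys are adjacent.
pairs = sorted(zip(data, keys), key=lambda pair: pair[1])

# Each group is an iterator over (data, key) pairs, not over bare values.
raw_groups = [(key, list(grp)) for key, grp in groupby(pairs, key=lambda pair: pair[1])]
print(raw_groups)

# Selecting element 0 of each pair recovers just the data values.
values_only = [[pair[0] for pair in grp] for _, grp in raw_groups]
print(values_only)  # [[1, 2, 4], [3], [5]]
```

Note that groupby only merges adjacent items, which is why the pairs are sorted by key first.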

Updated

This is the complete code I used for this answer.

from itertools import groupby

x, y, z = range(3)
keys = [x,x,x,y,y,x,x,z,z,z,x,x]
data = [1,2,3,4,5,6,7,8,9,10,11,12]

group = []
data = sorted(zip(data, keys), key=lambda x: (x[1]))
for i, grp in groupby(data, lambda x: x[1]):
    group.append([item[0] for item in grp])

print(group)
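
The question also asks for each subgroup to be sorted. Sorting by key alone relies on the sort being stable, so each subgroup keeps the original order of data instead of being sorted. One possible variation, sketched here with string keys and deliberately shuffled data (both hypothetical), sorts the pairs by key and then by value:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical inputs: data is intentionally out of order.
keys = ['x', 'y', 'x', 'z', 'y', 'x']
data = [9, 4, 1, 8, 2, 5]

# Tuples compare element by element, so this sorts by key, then by value.
pairs = sorted(zip(keys, data))

# Group adjacent pairs by key and keep only the values.
group = [[value for _, value in grp] for _, grp in groupby(pairs, key=itemgetter(0))]
print(group)  # [[1, 5, 9], [2, 4], [8]]
```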

Answer 1: (score: 1)

This is simpler if you use collections.OrderedDict and its setdefault method:

from collections import OrderedDict

# To demonstrate, I made the keys into strings
keys = ['x', 'x', 'x', 'y', 'y', 'x', 'x', 'z', 'z', 'z', 'x', 'x']
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

dct = OrderedDict()
for key,val in zip(keys, data):
    dct.setdefault(key, []).append(val)

print(dct)
print(list(dct.values()))

Output:

OrderedDict([('x', [1, 2, 3, 6, 7, 11, 12]), ('y', [4, 5]), ('z', [8, 9, 10])])
[[1, 2, 3, 6, 7, 11, 12], [4, 5], [8, 9, 10]]

Answer 2: (score: 1)

OrderedDict is probably the better choice, but...

import itertools as it
from operator import itemgetter
x = 1
y = 2
z = 3
keys = [x,x,x,y,y,x,x,z,z,z,x,x]
data = [1,2,3,4,5,6,7,8,9,10,11,12]
key = itemgetter(1)
value = itemgetter(0)

data = sorted(zip(data, keys), key=key)
# list(map(...)) so the groups print as lists on Python 3 as well as Python 2
print([list(map(value, grp)) for k, grp in it.groupby(data, key)])

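Answer 3: (score: 0)

Note that OrderedDict keeps keys in insertion order; it does not sort them afterwards. If the 'keys' list is not already in the required order, you will not get the expected result.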
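
A minimal sketch of that caveat, assuming a hypothetical keys list in which 'z' appears first: the groups come out in first-seen order unless the keys are sorted explicitly.

```python
from collections import OrderedDict

# Hypothetical input where 'z' appears before 'x' and 'y'.
keys = ['z', 'x', 'y', 'x', 'z']
data = [1, 2, 3, 4, 5]

dct = OrderedDict()
for key, val in zip(keys, data):
    dct.setdefault(key, []).append(val)

# Groups follow first-seen order: 'z', 'x', 'y'.
insertion_order = list(dct.values())
print(insertion_order)  # [[1, 5], [2, 4], [3]]

# Sorting the keys afterwards gives 'x', 'y', 'z' order.
sorted_order = [dct[key] for key in sorted(dct)]
print(sorted_order)  # [[2, 4], [3], [1, 5]]
```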

My solution:

from collections import defaultdict

keys = ['x', 'x', 'x', 'y', 'y', 'x', 'x', 'z', 'z', 'z', 'x', 'x']
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

# group 'data' values by 'key'
grouped = defaultdict(list)
# use 'value' as the loop variable so the 'data' list is not overwritten
for key, value in zip(keys, data):
    grouped[key].append(value)

# construct the final list of subgroups
# contents of each subgroup must be sorted
# also sorting the keys so that the 'x' subgroup comes before the 'y' subgroup etc
grouped_and_ordered = [sorted(grouped[key]) for key in sorted(grouped.keys())]