Grouping lists of various sizes in a tuple

Time: 2018-08-13 17:02:43

Tags: python python-3.x pandas

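I want to group all the lists in a tuple by the last element of each list, and also count how many times each last element occurs. The challenge is that the lists can all have different sizes.

Example input: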

[['aa', 'b'], ['bb', 'c'], ['cc', 'b'], ['dd','ee','a'], ['ff', 'gg', 'hh', 'a']]

The output I am trying to get is:

('a', 2, [('dd','ee'),('ff', 'gg', 'hh')]), ( 'b', 2, [('aa'), ('cc')]), ( 'c', 1, [('bb')])

Finally, I would like to convert the result into a pandas DataFrame. Any help or guidance would be greatly appreciated.

3 answers:

Answer 0: (score: 1)

A readable version

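import itertools
import operator

# input from the question
mylist = [['aa', 'b'], ['bb', 'c'], ['cc', 'b'], ['dd','ee','a'], ['ff', 'gg', 'hh', 'a']]

mylist.sort(key=operator.itemgetter(-1)) # sort by last element

result = []
for k, g in itertools.groupby(mylist, key=operator.itemgetter(-1)):
    # remove last element from each sublist:
    g = [tuple(sublist[:-1]) for sublist in g]
    result.append((k, len(g), g))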
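
The `mylist.sort(...)` call matters because `itertools.groupby` only groups consecutive items that share a key. A small sketch of the difference, using the question's input (the names `last`, `unsorted_keys`, and `sorted_keys` are illustrative only):

```python
import itertools
import operator

mylist = [['aa', 'b'], ['bb', 'c'], ['cc', 'b'], ['dd', 'ee', 'a'], ['ff', 'gg', 'hh', 'a']]
last = operator.itemgetter(-1)

# Without sorting, 'b' shows up in two separate groups,
# because its two sublists are not next to each other.
unsorted_keys = [k for k, _ in itertools.groupby(mylist, key=last)]

# After sorting by the last element, every key forms exactly one group.
sorted_keys = [k for k, _ in itertools.groupby(sorted(mylist, key=last), key=last)]

print(unsorted_keys)  # ['b', 'c', 'b', 'a']
print(sorted_keys)    # ['a', 'b', 'c']
```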

Answer 1: (score: 0)

Without importing any libraries

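# named data rather than list, to avoid shadowing the built-in list type
data = [['aa', 'b'], ['bb', 'c'], ['cc', 'b'], ['dd','ee','a'], ['ff', 'gg', 'hh', 'a']]

# map each last element to the tuples of elements that precede it
instances = {}
for sublist in data:
    leading_elements, last_element = sublist[:-1], sublist[-1]
    instances.setdefault(last_element, []).append(tuple(leading_elements))

# build one (key, count, groups) triple per key, ordered by key;
# the trailing comma keeps each triple intact instead of flattening it into result
result = tuple()
for key, val in sorted(instances.items()):
    result += ((key, len(val), val),)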

Answer 2: (score: -1)

Using itertools.groupby

>>> from itertools import groupby
>>> l = [['aa', 'b'], ['bb', 'c'], ['cc', 'b'], ['dd','ee','a'], ['ff', 'gg', 'hh', 'a']]
>>>
>>> f = lambda sl: sl[-1]
>>> res = [(k, [tuple(sl[:-1]) for sl in v]) for k,v in groupby(sorted(l, key=f), f)]
>>> res = [(k, len(v), v) for k,v in res]
>>> print(res)
[('a', 2, [('dd', 'ee'), ('ff', 'gg', 'hh')]), ('b', 2, [('aa',), ('cc',)]), ('c', 1, [('bb',)])]
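
The question also asks about converting the result into a pandas DataFrame, which none of the answers shows. One possible sketch, assuming one row per group is wanted: the column names `last`, `count`, and `groups` and the helper `to_dataframe` are made up for illustration and do not come from the question.

```python
# Grouped result in the (key, count, groups) shape produced by the answers.
res = [
    ('a', 2, [('dd', 'ee'), ('ff', 'gg', 'hh')]),
    ('b', 2, [('aa',), ('cc',)]),
    ('c', 1, [('bb',)]),
]

# Column names are illustrative assumptions, not taken from the question.
columns = ['last', 'count', 'groups']

# One dict per group, keyed by column name.
records = [dict(zip(columns, row)) for row in res]


def to_dataframe(records):
    """Build a DataFrame with one row per group (requires pandas)."""
    import pandas as pd  # imported here so the grouping part runs without pandas
    return pd.DataFrame(records, columns=columns)
```

Calling `to_dataframe(records)` should give a three-row frame in which the `groups` column holds plain Python lists of tuples. If one row per sublist is preferred instead, the records would need to be flattened first.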