I have a list of lists:
countall = [[5, 0], [4, 1], [4, 1], [3, 2], [4, 1], [3, 2], [3, 2], [2, 3], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [3, 2], [2, 3], [2, 3], [1, 4], [2, 3], [1, 4], [1, 4], [0, 5]]
I want to find the frequency of each sublist in the list above.
I tried using itertools:
freq = [len(list(group)) for x in countall for key, group in groupby(x)]
However, I get the wrong result:
[1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1]
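What is wrong with my list comprehension?
Answer 0 (score: 4)
groupby only groups items that are adjacent to each other, so to use it you need to sort the list first. Another option is to use the Counter class: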
from collections import Counter
countall = [[5, 0], [4, 1], [4, 1], [3, 2], [4, 1], [3, 2], [3, 2], [2, 3], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [3, 2], [2, 3], [2, 3], [1, 4], [2, 3], [1, 4], [1, 4], [0, 5]]
Counter([tuple(x) for x in countall])
Output:
Counter({(3, 2): 10, (2, 3): 10, (1, 4): 5, (4, 1): 5, (5, 0): 1, (0, 5): 1})
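To illustrate why sorting matters for groupby, here is a minimal sketch on a small made-up list (`pairs` is hypothetical data, not the list from the question). groupby starts a new group whenever the key changes, so equal sublists that are not adjacent end up in separate groups.

```python
from itertools import groupby

# Hypothetical small sample, chosen only to show the behaviour.
pairs = [[4, 1], [3, 2], [4, 1], [4, 1]]

# Without sorting: the first [4, 1] forms its own group of length 1.
unsorted_runs = [(key, len(list(group))) for key, group in groupby(pairs)]
print(unsorted_runs)  # [([4, 1], 1), ([3, 2], 1), ([4, 1], 2)]

# After sorting, equal sublists are adjacent, so each key appears once.
sorted_runs = [(key, len(list(group))) for key, group in groupby(sorted(pairs))]
print(sorted_runs)  # [([3, 2], 1), ([4, 1], 3)]
```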
Answer 1 (score: 3)
As ForceBru pointed out, sort your list first and then use groupby:
from itertools import groupby
countall = [[5, 0], [4, 1], [4, 1], [3, 2], [4, 1], [3, 2], [3, 2], [2, 3], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [3, 2], [2, 3], [2, 3], [1, 4], [2, 3], [1, 4], [1, 4], [0, 5]]
freq = [(key, len(list(x))) for key, x in groupby(sorted(countall))]
print(freq)
Output:
[([0, 5], 1), ([1, 4], 5), ([2, 3], 10), ([3, 2], 10), ([4, 1], 5), ([5, 0], 1)]
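Your code has errors:
freq = [len(list(group)) for x in countall for key, group in groupby(x)]
^ parenthesis missing
Also, you are grouping the items inside each sublist of countall, which is not needed:
for x in countall for key, group in groupby(x)
You can call groupby directly on sorted(countall) instead.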
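A small sketch of what the nested comprehension actually computes, using a made-up sample (`sample` and `inner` are names introduced here for illustration). Because groupby(x) iterates over the numbers inside one sublist, the result counts runs of equal numbers within each sublist rather than occurrences of the sublists themselves.

```python
from itertools import groupby

# Hypothetical sample; [2, 2] is included to show a run of length 2.
sample = [[5, 0], [2, 2], [5, 0]]

# groupby(x) sees the individual numbers of a single sublist x.
inner = [len(list(group)) for x in sample for key, group in groupby(x)]
print(inner)  # [1, 1, 2, 1, 1]
```

Note that the two [5, 0] entries are never compared with each other, so no sublist frequency comes out of this.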
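Also, as @Bemmu answered, you can use collections.Counter. However, it does not accept lists as keys, because lists are not hashable,
so you first have to convert the data to tuples or strings and then use Counter.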
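The hashability point can be checked with a short sketch (the `sample` list is hypothetical). Passing a list of lists to Counter raises a TypeError, while converting each sublist to a tuple first works.

```python
from collections import Counter

# Hypothetical small sample for illustration.
sample = [[4, 1], [3, 2], [4, 1]]

# Lists are unhashable, so Counter cannot use them as keys.
try:
    Counter(sample)
    raised = False
except TypeError:
    raised = True
print(raised)  # True

# Tuples are hashable, so converting each sublist first works.
counts = Counter(map(tuple, sample))
print(counts[(4, 1)], counts[(3, 2)])  # 2 1
print(counts.most_common(1))  # [((4, 1), 2)]
```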
Answer 2 (score: 1)
As mentioned in the comments, if you use groupby, you need to sort first.
Code:
import itertools as it
freq = {tuple(key): len(list(group)) for key, group in it.groupby(sorted(countall))}
Test code:
countall = [[5, 0], [4, 1], [4, 1], [3, 2], [4, 1], [3, 2], [3, 2], [2, 3],
[4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4],
[4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4],
[3, 2], [2, 3], [2, 3], [1, 4], [2, 3], [1, 4], [1, 4], [0, 5]]
print(freq)
Result:
{(3, 2): 10, (1, 4): 5, (2, 3): 10, (5, 0): 1, (0, 5): 1, (4, 1): 5}
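As a sanity check, the groupby-based dictionary can be compared with a Counter built from the same data. This is only a sketch and not part of the original answer; `by_groupby` and `by_counter` are names introduced here.

```python
from collections import Counter
from itertools import groupby

countall = [[5, 0], [4, 1], [4, 1], [3, 2], [4, 1], [3, 2], [3, 2], [2, 3],
            [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4],
            [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4],
            [3, 2], [2, 3], [2, 3], [1, 4], [2, 3], [1, 4], [1, 4], [0, 5]]

# Sort first so that equal sublists are adjacent, then count each run.
by_groupby = {tuple(key): len(list(group))
              for key, group in groupby(sorted(countall))}

# Counter needs hashable keys, hence the tuple conversion.
by_counter = Counter(tuple(x) for x in countall)

print(by_groupby == dict(by_counter))  # True
print(sum(by_groupby.values()) == len(countall))  # True
```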