Question

I'm trying to count the number of sublists in which each term appears, but I'm stuck at the first step. For example, given

collection = [['a','b','b','c','d','e','f'],['b','b','d','f']]

it should return (a, 1) (b, 2) (c, 1) (d, 2) (e, 1) (f, 2)

I can iterate over the collection to list every item

[item for sublist in collection for item in sublist]

The problem is that I'm not sure how to record a count once an occurrence is found and then move on to the next sublist.

[item for sublist in collection for item in sublist if 'b' == item]

This gives me back

['b', 'b', 'b', 'b']

I want it to return 2 instead. This is how I envisioned the code:

count = [count++ for sublist in collection for item in sublist if 'b' == item]

Answer 1

You can flatten collection into a set of distinct items, then count the sublists that contain each one:

collection = [['a','b','b','c','d','e','f'],['b','b','d','f']]
c = {i for b in collection for i in b}
final_results = [(i, sum(i in x for x in collection)) for i in c]

Output (the order may vary, since sets are unordered):

[('c', 1), ('d', 2), ('f', 2), ('e', 1), ('a', 1), ('b', 2)]
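
In case it helps to see why this gives 2 rather than 4 for 'b': a membership test such as 'b' in sublist is a bool, and Python adds True as 1 and False as 0. Here is a minimal sketch contrasting the two counts; the names occurrences and sublists_with_b are only illustrative and not part of the answer above.

```python
collection = [['a','b','b','c','d','e','f'],['b','b','d','f']]

# Counting matching items counts duplicates too: 'b' occurs four times overall.
occurrences = sum(1 for sublist in collection for item in sublist if item == 'b')

# Counting sublists instead: each membership test is a bool, and True adds as 1,
# so a sublist contributes at most once however many 'b's it holds.
sublists_with_b = sum('b' in sublist for sublist in collection)

print(occurrences, sublists_with_b)  # 4 2
```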

Answer 2

Use sum with a generator expression that yields 1 whenever the element is in the sublist.

import itertools

collection = [['a','b','b','c','d','e','f'],['b','b','d','f']]

all_letters = set(itertools.chain.from_iterable(collection))
# or write them out by hand
# all_letters = {'a', 'b', 'c', 'd', 'e', 'f'}

result = [(ch, sum(1 for sublst in collection if ch in sublst)) for
          ch in all_letters]
# [('e', 1), ('d', 2), ('f', 2), ('b', 2), ('a', 1), ('c', 1)]
# or some other order, since sets are unordered.
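
If a predictable order matters, one possible variation (my assumption, not something the question asks for) is to collect the distinct letters with dict.fromkeys instead of set. On Python 3.7+ a dict keeps insertion order, so the letters come out in first-appearance order.

```python
import itertools

collection = [['a','b','b','c','d','e','f'],['b','b','d','f']]

# dict.fromkeys drops duplicates but keeps first-seen order (Python 3.7+).
all_letters = list(dict.fromkeys(itertools.chain.from_iterable(collection)))

result = [(ch, sum(1 for sublst in collection if ch in sublst))
          for ch in all_letters]
print(result)
# [('a', 1), ('b', 2), ('c', 1), ('d', 2), ('e', 1), ('f', 2)]
```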

Answer 3

If you need overall statistics like this, you're better off using a dictionary

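collection = [['a','b','b','c','d','e','f'],['b','b','d','f']]

counts = {}
for sublist in collection:
    for element in sublist:
        # Handle each distinct element only once, the first time it is seen.
        if element not in counts:
            counts[element] = 0
            # Count how many sublists contain this element.
            for other in collection:
                if element in other:
                    counts[element] += 1
# counts == {'a': 1, 'b': 2, 'c': 1, 'd': 2, 'e': 1, 'f': 2}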

Perhaps not the most efficient, but it gets the job done.
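
On the efficiency point, a single-pass alternative would be collections.Counter fed with the set of each sublist, so every sublist contributes at most 1 per element. This is a sketch of a different technique than the nested loops above, and it assumes the items are hashable (as the strings here are).

```python
from collections import Counter

collection = [['a','b','b','c','d','e','f'],['b','b','d','f']]

counts = Counter()
for sublist in collection:
    # set(sublist) removes duplicates within the sublist before counting.
    counts.update(set(sublist))

print(sorted(counts.items()))
# [('a', 1), ('b', 2), ('c', 1), ('d', 2), ('e', 1), ('f', 2)]
```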

Count the number of sublists in which each term appears
