Question

I have a list of sets:

a = [{'foo','cpu','phone'},{'foo','mouse'}, {'dog','cat'}, {'cpu'}]

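Expected result:

I want to look at each individual string, count how often it occurs, and return, in the original format, every item whose count is >= 2: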

a = [{'foo','cpu'}, {'foo'}, {'cpu'}]

This is what I have so far, but I'm stuck on the last part, where I need to append to the new list:

from collections import Counter
counter = Counter()
for a_set in a:
    # Use a counter to count the occurrences of each word
    counter.update(a_set)
result = []
for a_set in a:
    for word in a_set:
        if counter[word] >= 2:
            # Not sure how I should append my new set below.
            result.append(a_set)
            break
print(result)

Answer 1

You are simply appending the original set. Instead, you should build a new set that contains only the words that occur at least twice.

result = []
for a_set in a:
    new_set = {
        word for word in a_set
        if counter[word] >= 2
    }
    if new_set:  # check that the new set is not empty
        result.append(new_set)
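
For reference, here is a minimal end-to-end sketch that combines the counter from the question with the loop above, so it can be run as is. It assumes the same sample list `a` as in the question.

```python
from collections import Counter

a = [{'foo', 'cpu', 'phone'}, {'foo', 'mouse'}, {'dog', 'cat'}, {'cpu'}]

# Count in how many sets each word appears (a set holds each word at most once).
counter = Counter()
for a_set in a:
    counter.update(a_set)

result = []
for a_set in a:
    # Keep only the words that were counted at least twice.
    new_set = {word for word in a_set if counter[word] >= 2}
    if new_set:  # skip sets that end up empty
        result.append(new_set)

print(result)
```

Element order inside each printed set may vary, since sets are unordered.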

Answer 2

Alternatively, use the following short approach based on set intersection:

from collections import Counter

a = [{'foo','cpu','phone'},{'foo','mouse'}, {'dog','cat'}, {'cpu'}]
c = Counter([i for s in a for i in s])
valid_keys = {k for k,v in c.items() if v >= 2}
res = [s & valid_keys for s in a if s & valid_keys]

print(res)   # [{'cpu', 'foo'}, {'foo'}, {'cpu'}]
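
If the same filtering is needed in several places, the intersection idea could be wrapped in a small function. The sketch below is only one possible shape: the name `filter_common` and the `min_count` parameter are hypothetical, not part of the answer. It also computes each intersection once instead of twice.

```python
from collections import Counter
from itertools import chain


def filter_common(sets, min_count=2):
    """Hypothetical helper: keep only items that occur in at least min_count sets."""
    counts = Counter(chain.from_iterable(sets))
    valid_keys = {k for k, v in counts.items() if v >= min_count}
    # Intersect each set with the valid keys once, then drop the empty results.
    trimmed = (s & valid_keys for s in sets)
    return [s for s in trimmed if s]


a = [{'foo', 'cpu', 'phone'}, {'foo', 'mouse'}, {'dog', 'cat'}, {'cpu'}]
res = filter_common(a)
print(res)
```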

Answer 3

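This is what I ended up doing:

Build a counter, then iterate over the original list of sets and filter out the items with a count < 2, then filter out any empty sets.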
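
The steps described above can be sketched roughly as follows. This is an illustration of the description, assuming the sample list `a` from the question; it is not necessarily the exact code the answerer used, and the variable names are made up.

```python
from collections import Counter

a = [{'foo', 'cpu', 'phone'}, {'foo', 'mouse'}, {'dog', 'cat'}, {'cpu'}]

# Build the counter over every word in every set.
counter = Counter(word for a_set in a for word in a_set)

# Step 1: remove the words with a count < 2 from each set.
filtered = [{word for word in a_set if counter[word] >= 2} for a_set in a]

# Step 2: remove the sets that step 1 left empty.
result = [a_set for a_set in filtered if a_set]

print(result)
```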

Filter a list of sets with a specific condition