Filter a list of sets by a specific condition

Time: 2019-09-08 17:57:26

Tags: python list set

I have a list of sets:

a = [{'foo','cpu','phone'},{'foo','mouse'}, {'dog','cat'}, {'cpu'}]

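Expected result:

I want to look at each individual string, count its occurrences across all sets, and return, in the original format, only the strings whose count is >= 2: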

a = [{'foo','cpu'}, {'foo'}, {'cpu'}]

This is what I have so far, but I'm stuck on the last part, where I need to append to the new list:

from collections import Counter
counter = Counter()
for a_set in a:
    # Update the counter with the occurrences of each word
    counter.update(a_set)
result = []
for a_set in a:
    for word in a_set:
        if counter[word] >= 2:
            # Not sure how I should append my new set below.
            result.append(a_set)
            break
print(result)

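3 Answers:

Answer 0: (score: 0)

You are just appending the original set. Instead, you should build a new set from the words that occur at least twice, and append that.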

result = []
for a_set in a:
    # Keep only the words that occur at least twice overall
    new_set = {word for word in a_set if counter[word] >= 2}
    if new_set:  # check that the new set is not empty
        result.append(new_set)
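
The snippet above relies on `a` and `counter` from the question. For convenience, here is a self-contained sketch that puts the counting step from the question together with the filtering step from this answer; it assumes the same sample data as in the question.

```python
from collections import Counter

a = [{'foo', 'cpu', 'phone'}, {'foo', 'mouse'}, {'dog', 'cat'}, {'cpu'}]

# Count in how many sets each word appears
# (a set contains each word at most once).
counter = Counter()
for a_set in a:
    counter.update(a_set)

result = []
for a_set in a:
    # Keep only the words that occur at least twice overall.
    new_set = {word for word in a_set if counter[word] >= 2}
    if new_set:  # skip sets that end up empty
        result.append(new_set)

print(result)
```

Because sets are unordered, the elements inside each printed set may appear in any order, but the list itself keeps the order of the original sets.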

Answer 1: (score: 0)

Alternatively, use the following short approach based on set intersection:

from collections import Counter

a = [{'foo','cpu','phone'},{'foo','mouse'}, {'dog','cat'}, {'cpu'}]
c = Counter([i for s in a for i in s])
valid_keys = {k for k,v in c.items() if v >= 2}
res = [s & valid_keys for s in a if s & valid_keys]

print(res)   # [{'cpu', 'foo'}, {'foo'}, {'cpu'}]

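Answer 2: (score: 0)

This is what I ended up doing:

Build a counter, then iterate over the original list of sets and filter out the items with a count < 2, then filter out any empty sets: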
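
A minimal sketch of the steps described above, not necessarily the code this answer originally used. The function name `filter_sets` and the `min_count` parameter are hypothetical names chosen for illustration.

```python
from collections import Counter
from itertools import chain


def filter_sets(sets, min_count=2):
    """Hypothetical helper: keep words seen at least `min_count` times, drop empty sets."""
    # Step 1: build a counter over every word in every set.
    counts = Counter(chain.from_iterable(sets))
    # Step 2: filter out the items whose count is below the threshold.
    filtered = ({word for word in s if counts[word] >= min_count} for s in sets)
    # Step 3: filter out any sets that are now empty.
    return [s for s in filtered if s]


a = [{'foo', 'cpu', 'phone'}, {'foo', 'mouse'}, {'dog', 'cat'}, {'cpu'}]
res = filter_sets(a)
print(res)
```

Making the threshold a parameter is only a design convenience; with the default `min_count=2` this behaves like the other answers on the sample data.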
