How can I use Counter from collections to count words across different lists in Python?

Time: 2018-04-08 22:14:15

Tags: python python-3.x collections count counter

I have the following code:

from collections import Counter

def myFunc(word):
    for sList in word:
        counts = Counter(sList)
        print(counts)


myFunc([['Apple', 'Orange', 'Banana'], ["Banana", "Orange"]])

Output:

Counter({'Apple': 1, 'Orange': 1, 'Banana': 1})
Counter({'Banana': 1, 'Orange': 1})

This works fine. But what if I want an output dictionary like this:

{'Apple': {'Orange':1, 'Banana': 1}, 'Orange': {'Apple':1, 'Banana':2},
  'Banana': {'Apple':1, 'Orange':2}}

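That is, the keys should be all the distinct words across my lists. Each value should hold the counts of all the other words, taken only from the lists in which the key appears.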

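1 answer:

Answer 0: (score: 1)

I don't know of any built-in function that does this, so I wrote a snippet that works at least for the cases I tried, although the solution isn't very elegant. It consists of clumsily nested for loops and if statements. I'm sure a better solution can be found.

The problem can be split into two parts: getting the unique keys and getting the corresponding values. Getting the keys is easy; I used Counter(), although set() would work as well. Getting the corresponding values is the tricky part. For each unique key, I iterate over the per-list dictionaries to find the ones that contain the key. Once such a dictionary is found, I take each of the other keys in it and iterate over all the dictionaries in which the key is present, summing up that other key's counts.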

from collections import Counter

def myFunc(word):
    # countered_list contains Counter() of individual lists.
    # It is local so that repeated calls do not accumulate state.
    countered_list = []
    # Gives the unique keys.
    complete = []
    for each_list in word:
        complete.extend(each_list)
        countered_list.append(Counter(each_list))

    # set() can also be used instead of Counter()
    counts = Counter(complete)
    output = {key:{} for key in counts}

    # Start iteration with each key in count => key is unique
    for key in counts:
        # Iterate over the dictionaries in countered_list
        for each_dict in countered_list:
            # if key is in each_dict then iterate over all the other keys in dict
            if key in each_dict:
                for other_keys in each_dict:
                    # Excludes the key
                    if key != other_keys:
                        temp = 0
                        # Now iterate over all dicts for other_keys and add the value to temp
                        for every_dict in countered_list:
                            # Excludes the dictionaries in which key is not present.
                            if key in every_dict:
                                temp += every_dict[other_keys]
                        output[key][other_keys] = temp

    print(output)

Here are some test cases:

>>> new_list = [['a','a'],['b','b'],['c','c']]
>>> myFunc(new_list)
{'a': {}, 'c': {}, 'b': {}}
>>> new_list = [['a','a'],['b','b'],['c','c','a','a']]
>>> myFunc(new_list)
{'a': {'c': 2}, 'c': {'a': 2}, 'b': {}}
>>> new_list = [['a','a'],['b','b','a'],['c','c','a','a']]
>>> myFunc(new_list)
{'a': {'c': 2, 'b': 2}, 'c': {'a': 2}, 'b': {'a': 1}}
>>> new_list = [['ab','ba'],['ba','ab','ab'],['c','c','ab','ba']]
>>> myFunc(new_list)
{'c': {'ab': 1, 'ba': 1}, 'ab': {'c': 2, 'ba': 3}, 'ba': {'c': 2, 'ab': 4}}
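
The answer notes that a tidier solution probably exists. One possible sketch, assuming the same semantics as myFunc above, accumulates each list's Counter into a per-key tally in a single pass. The name co_counts is hypothetical and not part of the original answer; it returns the dictionary instead of printing it.

```python
from collections import Counter, defaultdict

def co_counts(lists):
    # co_counts is a hypothetical name used only for this sketch.
    output = defaultdict(Counter)
    for words in lists:
        counts = Counter(words)
        for key in counts:
            # Add this list's counts to the tally of every word it contains.
            output[key].update(counts)
            # Remove the key's own count so only the other words remain.
            del output[key][key]
    # Convert to plain dicts; words that never co-occur map to {}.
    return {key: dict(tally) for key, tally in output.items()}
```

Called with the lists from the question, [['Apple', 'Orange', 'Banana'], ['Banana', 'Orange']], this should return a dictionary equal to the desired output shown there. It visits each list once per distinct word it contains, rather than rescanning every per-list dictionary for every pair of keys.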