Question

Task

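Given a list of strings, each of which may or may not contain one or more words, how can I create a word-frequency dictionary using functional programming? By functional programming I am explicitly referring to map, filter, or reduce. List comprehensions also fall under the umbrella of functional programming here.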

Code

def count_individual_words(word_list):
    word_count = {x: y.count(x) for y in word_list for x in y.split()}
    return word_count

tweets = ["I am a cat", "cat", "Who is a good cat"]

for i,v in count_individual_words(tweets).items():
    print(i,v)

#Expected Output (dict)
# => {
# "I": 1,
# "am": 1,
# "a": 2,
# "cat": 3,
# "Who": 1,
# "is": 1,
# "good": 1 }

Main problem

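The main problem arises when counting words that occur more than once, such as cat or a. Instead of adding one to the current count for the word, the code overwrites the count. As a result, I end up with a dictionary that shows every word occurring only once.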

I would greatly appreciate any answer that uses map, filter, or reduce, since I am curious how to accomplish this task with those functions.

Answer 1

This is basically what collections.Counter is for. But if you want to build the dictionary yourself, you can also use defaultdict from the collections module:

In [17]: from collections import defaultdict

In [18]: d = defaultdict(int)

In [20]: for sent in tweets:
             for word in sent.split():
                 d[word] += 1
   ....:         

In [21]: d
Out[21]: defaultdict(<class 'int'>, {'a': 2, 'is': 1, 'good': 1, 'am': 1, 'I': 1, 'cat': 3, 'Who': 1})
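
For comparison, here is a minimal sketch of the collections.Counter route mentioned at the start of this answer. It assumes the same tweets list as in the question; the name word_count is only illustrative.

```python
from collections import Counter

tweets = ["I am a cat", "cat", "Who is a good cat"]

# Flatten every tweet into its words and let Counter tally them.
word_count = Counter(word for tweet in tweets for word in tweet.split())

# Counter is a dict subclass, so it can be converted or used directly.
print(dict(word_count))
```

Counter does the incrementing internally, so there is no risk of overwriting an earlier count.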

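Another, less efficient approach uses a list comprehension and a dict comprehension. It is slower because all_words.count rescans the whole list once for every distinct word: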

In [36]: all_words = [i for sub in tweets for i in sub.split()]

In [37]: {word: all_words.count(word) for word in set(all_words)}
Out[37]: {'a': 2, 'is': 1, 'Who': 1, 'am': 1, 'I': 1, 'cat': 3, 'good': 1}

Doing this with functional programming might look like this:

In [38]: unique = set(all_words)

In [39]: dict(zip(unique, map(all_words.count, unique)))
Out[39]: {'a': 2, 'is': 1, 'Who': 1, 'am': 1, 'I': 1, 'cat': 3, 'good': 1}
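
Since the question also asks about reduce, the following is one possible sketch of a reduce-based version. The helper add_word and the variable names are assumptions made for illustration, not part of the original answer.

```python
from functools import reduce

tweets = ["I am a cat", "cat", "Who is a good cat"]

def add_word(counts, word):
    # Return a new dict instead of mutating the accumulator,
    # which keeps the reducer free of side effects.
    return {**counts, word: counts.get(word, 0) + 1}

# map splits each tweet into words; this reduce concatenates
# the per-tweet word lists into one flat list.
all_words = reduce(lambda acc, words: acc + words, map(str.split, tweets), [])

# This reduce folds the flat word list into a frequency dict.
word_count = reduce(add_word, all_words, {})
print(word_count)
```

Copying the dict on every step is wasteful for large inputs; mutating the accumulator in place would be faster, at the cost of a less purely functional style.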

Answer 2

The most logical approach also uses functional programming, but simply feeds the words to collections.Counter:

import collections,itertools
collections.Counter(itertools.chain.from_iterable(x.split() for x in tweets))

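If you want to count/accumulate without using Counter, here is another approach:

generate a chained, sorted list of the words
group them and build a dictionary counting the occurrences

Code: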

import itertools

tweets = ["I am a cat", "cat", "Who is a good cat"]

words = sorted(list(itertools.chain.from_iterable(x.split() for x in tweets)))
count = {k:len(list(v)) for k,v in itertools.groupby(words)}

Result:

{'cat': 3, 'I': 1, 'Who': 1, 'is': 1, 'am': 1, 'a': 2, 'good': 1}

This could even be written as a one-liner, but readability would suffer.

(Note that the generator is forced into a list before being passed to sorted, which is meant to speed up the operation.)
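
As a rough illustration of the one-liner mentioned above, the two statements can be collapsed as follows. This sketch passes the chained iterator straight to sorted, which accepts any iterable; that is a simplification of the code above, not the answer's exact wording.

```python
import itertools

tweets = ["I am a cat", "cat", "Who is a good cat"]

# Chain all words, sort them so equal words are adjacent, then count each group.
count = {k: len(list(v)) for k, v in itertools.groupby(sorted(itertools.chain.from_iterable(x.split() for x in tweets)))}
print(count)
```

The sort is essential here: groupby only merges adjacent equal items, so unsorted input would split a repeated word into several groups and the later group would overwrite the earlier count.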
