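Count the frequency of each word from a list in a given string

Time: 2019-04-17 13:25:13

Tags: python count text-files

I want to sum the frequency counts of every word in a list. How can I do that? Details: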

list = ['Apple', 'Mango' ,'Orange','p[éeêè]t[s]' ]
text = 'I have Apple and mood today, This morning i ate mango and pret then Orange'

In this case, I want it to return 4.

4 answers:

Answer 0: (score: 0)

You can use str.count and sum together with a generator expression.

>>> words = ['Apple', 'Mango', 'Orange' ]
>>> text = 'I have Apple and Mango mood today, This morning i ate Mango and then Orange'
>>> sum(text.count(word) for word in words)
4
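Note that str.count counts substring occurrences, so 'Apple' is also counted inside 'Apples'. A minimal sketch of a whole-word variant follows, assuming regex word boundaries (\b) are the desired notion of a word; the helper name count_whole_words is hypothetical and not part of the answer above.

```python
import re

def count_whole_words(words, text):
    """Sum whole-word occurrences of each literal word in text."""
    total = 0
    for word in words:
        # re.escape treats the word literally; \b anchors it at word boundaries
        pattern = r"\b" + re.escape(word) + r"\b"
        total += len(re.findall(pattern, text))
    return total

words = ['Apple', 'Mango', 'Orange']
text = 'I have Apple and Mango mood today, This morning i ate Mango and then Orange'
print(count_whole_words(words, text))  # 4
```

With this helper, 'Apples Apple' yields 1 for the word 'Apple', whereas 'Apples Apple'.count('Apple') yields 2.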

Answer 1: (score: 0)

Try:

call `admin`.`func_fill_table`(5000);
call `data`.`func_get_data`(5);

Answer 2: (score: 0)

You can use a dict comprehension to get the frequency of each word, then sum the values to get the total:

>>> list_
['Apple', 'Mango', 'Orange']
>>> text
'I have Apple and Mango mood today, This morning i ate Mango and then Orange'
>>> y = {x: text.count(x) for x in list_}
>>> y
{'Orange': 1, 'Mango': 2, 'Apple': 1}
>>> sum(y.values())
4

After the question was changed, you need something like this:

>>> import re
>>> list_ = ['Apple', 'Mango' ,'Orange', 'pr[éeêè]t[s]?' ]
>>> text
'I have Apple and mood today, This morning i ate mango and pret then Orange'
>>> re.findall(r'|'.join(list_), text)
['Apple', 'pret', 'Orange']
>>> len(re.findall(r'|'.join(list_), text))
3

If you need the frequency of each word, use Counter from the collections module:

>>> from collections import Counter
>>> Counter(re.findall(r'|'.join(list_), text))
Counter({'Orange': 1, 'pret': 1, 'Apple': 1})

For a case-insensitive search, a flag such as re.IGNORECASE can be passed to re.findall.
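A minimal sketch of the case-insensitive search, assuming the standard re.IGNORECASE flag is what is wanted; the names patterns, combined, and matches are only illustrative. Each pattern is wrapped in a non-capturing group so the alternatives stay separate when joined.

```python
import re
from collections import Counter

patterns = ['Apple', 'Mango', 'Orange', 'pr[éeêè]t[s]?']
text = 'I have Apple and mood today, This morning i ate mango and pret then Orange'

# Join the patterns into one alternation; (?:...) keeps each alternative separate
combined = '|'.join('(?:{})'.format(p) for p in patterns)

# re.IGNORECASE makes 'Mango' also match 'mango'
matches = re.findall(combined, text, flags=re.IGNORECASE)
print(matches)       # ['Apple', 'mango', 'pret', 'Orange']
print(len(matches))  # 4

# Normalise case before counting so 'Mango' and 'mango' share one key
frequencies = Counter(m.lower() for m in matches)
print(frequencies['mango'])  # 1
```

With the flag set, the total for the question's text is 4, which is the value the question asks for.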

Answer 3: (score: 0)

You can split the text into a list and then loop over each word in that list. If the word is in your list of words, increment a counter:

words = ['Apple', 'Mango', 'Orange' ]
text = 'I have Apple and Mango mood today, This morning i ate Mango and then Orange'

textlist = text.split(" ")  # split text to words;
counter = 0
for word in textlist:
    if word in words:
        counter+=1 
print(counter)

Output:

4

The following code additionally removes a comma or period at the end of a word:

textlist = text.split(" ")
print(textlist)
counter = 0
for word in textlist:
    # endswith is safe on the empty strings that split(" ") yields for repeated spaces
    if word.endswith((',', '.')):  # if last character is comma or period
        word = word[:-1]           # remove last character
    if word in words:
        counter += 1
print(counter)
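The same idea can be sketched more compactly by tokenizing with a regular expression, which drops punctuation on either side of a word. This is an alternative technique rather than the answer's own code, and the \w+ tokenizer is an assumption about what counts as a word; the sample text adds a trailing comma and period for illustration.

```python
import re
from collections import Counter

words = ['Apple', 'Mango', 'Orange']
text = 'I have Apple and Mango mood today, This morning i ate Mango, and then Orange.'

# \w+ extracts runs of word characters, so commas and periods are discarded
tokens = re.findall(r"\w+", text)

# A set gives constant-time membership checks for the wanted words
wanted = set(words)
counts = Counter(token for token in tokens if token in wanted)

print(counts['Mango'])       # 2
print(sum(counts.values()))  # 4
```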