Python - Retrieve specific sentences from a large text file

Time: 2015-07-23 02:45:16

Tags: python list count frequency

Here is my sentence:

s = "& how are you then? I am fine, % and i want to found some food #meat with vegetable# #tea# #cake# and #tea# so on."

I want to count the frequency of the phrases enclosed between # and # in the sentence s.

I would like the following output:

[("meat with vegetable", 1),
 ("tea", 2),
 ("cake", 1)]

Thank you very much for your help and time!

1 answer:

Answer 0: (score: 1)

With the power of re and Counter, this is easy to do:

In [1]: import re

In [2]: s = "& how are you then? I am fine, % and i want to found some food #meat with vegetable# #tea# #cake# and #tea# so on."

In [3]: re.findall(r'#([^#]*)#', s)
Out[3]: ['meat with vegetable', 'tea', 'cake', 'tea']

In [4]: from collections import Counter

In [5]: Counter(re.findall(r'#([^#]*)#', s))
Out[5]: Counter({'tea': 2, 'cake': 1, 'meat with vegetable': 1})

For more information, read the Python documentation on re and collections.Counter.
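Since the question asks for a list of tuples and the title mentions a large file, here is a minimal sketch that wraps the same re.findall and Counter idea in a function. The names count_tagged_phrases and TAG_PATTERN are made up for illustration and are not part of the answer above. It assumes Python 3.7 or later, where Counter keeps first-seen order, and that a #...# phrase never spans more than one line.

```python
import re
from collections import Counter

# Same pattern as in the answer: capture everything between two '#' characters.
TAG_PATTERN = re.compile(r'#([^#]*)#')


def count_tagged_phrases(lines):
    """Count #...# phrases over an iterable of lines (hypothetical helper).

    Assumes a phrase never spans more than one line.
    """
    counts = Counter()
    for line in lines:
        # findall returns only the captured group, i.e. the text between the '#'s.
        counts.update(TAG_PATTERN.findall(line))
    # On Python 3.7+ a Counter preserves insertion order, so the pairs come
    # out in the order each phrase was first seen.
    return list(counts.items())


s = ("& how are you then? I am fine, % and i want to found some food "
     "#meat with vegetable# #tea# #cake# and #tea# so on.")

result = count_tagged_phrases([s])
print(result)
```

Because the function accepts any iterable of lines, an open file object can be passed in directly, so a large file is read one line at a time instead of being loaded into memory at once. If the most frequent phrases are wanted first, Counter.most_common() could be used in place of items().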