I am working with the following list:
["hello, how are you", "hello", "how are you doing","are you ok"]
How can I get the frequency of each word across all of the elements?
For the whole list, the output should look like this:
you: 3
are: 3
hello: 2
how: 2
doing: 1
ok: 1
Answer 0 (score: 0)
Use collections.Counter.
For example:
from collections import Counter
import string
data = ["hello, how are you", "hello", "how are you doing","are you ok"]
translator = str.maketrans('', '', string.punctuation)
d = Counter(" ".join(data).translate(translator).split())
# On Python 2, use this line instead:
#d = Counter(" ".join(data).translate(None, string.punctuation).split())
print(d)
Output:
Counter({'are': 3, 'you': 3, 'how': 2, 'hello': 2, 'doing': 1, 'ok': 1})
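To print the counts in the `word: count` layout from the question, the result of `Counter.most_common()` can be iterated directly. This is a minimal sketch that reuses the same data; the helper name `word_counts` is made up for illustration. On recent Python versions, words with equal counts come out in first-seen order, so ties may be ordered differently from the listing in the question.

```python
from collections import Counter
import string


def word_counts(sentences):
    """Count words across all sentences, ignoring ASCII punctuation."""
    translator = str.maketrans('', '', string.punctuation)
    return Counter(" ".join(sentences).translate(translator).split())


data = ["hello, how are you", "hello", "how are you doing", "are you ok"]

# most_common() yields (word, count) pairs, highest count first
for word, count in word_counts(data).most_common():
    print(f"{word}: {count}")
```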
Answer 1 (score: 0)
You can use collections.Counter:
from collections import Counter
import string
l=["hello, how are you", "hello", "how are you doing","are you ok"]
Counter([w.strip(string.punctuation) for s in l for w in s.split() ])
# Counter({'are': 3, 'you': 3, 'hello': 2, 'how': 2, 'doing': 1, 'ok': 1})
Answer 2 (score: 0)
def wordListToFreqDict(wordlist):
    # Map each word to the number of times it occurs in wordlist
    wordfreq = [wordlist.count(p) for p in wordlist]
    return dict(zip(wordlist, wordfreq))

def sortFreqDict(freqdict):
    # Build (count, word) pairs and sort them in descending order
    aux = [(freqdict[key], key) for key in freqdict]
    aux.sort()
    aux.reverse()
    return aux

a = ["hello, how are you", "hello", "how are you doing","are you ok"]
wordstring = ' '.join(a)
wordlist = wordstring.split()  # splits on whitespace only; punctuation stays attached
wordfreq = [wordlist.count(w) for w in wordlist] # a list comprehension
dictionary = wordListToFreqDict(wordlist)
sorteddict = sortFreqDict(dictionary)
for s in sorteddict: print(str(s))
Result: (3, 'you') (3, 'are') (2, 'how') (1, 'ok') (1, 'hello,') (1, 'hello') (1, 'doing')
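Note that this result counts "hello," and "hello" as two different words, because the list is split on whitespace without removing punctuation, so it does not give the hello: 2 the question asks for. One possible variation, not part of the answer above, strips punctuation from each token and counts in a single pass with a plain dict instead of calling `list.count` for every word. The function name `word_frequencies` is hypothetical, and ties are broken alphabetically here as an arbitrary choice.

```python
import string


def word_frequencies(sentences):
    """Return (word, count) pairs, highest count first, ties alphabetical."""
    freq = {}
    for sentence in sentences:
        for token in sentence.split():
            # Remove leading/trailing punctuation such as the comma in "hello,"
            word = token.strip(string.punctuation)
            if word:  # skip tokens that consisted only of punctuation
                freq[word] = freq.get(word, 0) + 1
    return sorted(freq.items(), key=lambda item: (-item[1], item[0]))


a = ["hello, how are you", "hello", "how are you doing", "are you ok"]
for word, count in word_frequencies(a):
    print(f"{word}: {count}")
```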