Get the most frequent words from the elements of a list

Time: 2018-07-04 09:39:15

Tags: python python-3.x

I am working with the following list:

["hello, how are you", "hello", "how are you doing","are you ok"]

How can I get the frequency of each word across all of the elements?

The output for the whole list should look like this:

you: 3
are: 3
hello: 2
how: 2
doing: 1
ok: 1

3 answers:

Answer 0 (score: 0)

Use collections.Counter

For example:

from collections import Counter
import string

data = ["hello, how are you", "hello", "how are you doing","are you ok"]
translator = str.maketrans('', '', string.punctuation)

d = Counter(" ".join(data).translate(translator).split())
# On Python 2, use this instead:
# d = Counter(" ".join(data).translate(None, string.punctuation).split())

print(d)

Output:

Counter({'are': 3, 'you': 3, 'how': 2, 'hello': 2, 'doing': 1, 'ok': 1})
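The question asks for one "word: count" line per word, ordered by frequency, rather than the Counter repr. A minimal sketch of that last step, assuming Python 3.6+ for f-strings (the names `counts` and `lines` are illustrative, not from the answer):

```python
from collections import Counter
import string

data = ["hello, how are you", "hello", "how are you doing", "are you ok"]

# Remove punctuation so that "hello," and "hello" count as the same word.
translator = str.maketrans("", "", string.punctuation)
counts = Counter(" ".join(data).translate(translator).split())

# most_common() yields (word, count) pairs from most to least frequent.
lines = [f"{word}: {count}" for word, count in counts.most_common()]
print("\n".join(lines))
```

Words with equal counts may not appear in exactly the order shown in the question, since the question does not say how ties should be broken.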

Answer 1 (score: 0)

You can use collections.Counter

from collections import Counter
import string

l=["hello, how are you", "hello", "how are you doing","are you ok"]

Counter([w.strip(string.punctuation) for s in l for w in s.split() ])
# Counter({'are': 3, 'you': 3, 'hello': 2, 'how': 2, 'doing': 1, 'ok': 1})
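Note that str.strip only removes punctuation from the two ends of each word, whereas the translate-based approach in Answer 0 removes it everywhere. For the list in the question both give the same counts, but they can differ on words with inner punctuation. A small illustration, using a made-up sample word:

```python
import string

word = "don't,"  # hypothetical sample, not from the question's data

# strip() trims punctuation only at the start and end of the string.
stripped = word.strip(string.punctuation)

# translate() with a deletion table removes every punctuation character.
translated = word.translate(str.maketrans("", "", string.punctuation))

print(stripped)    # don't
print(translated)  # dont
```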

Answer 2 (score: 0)

def wordListToFreqDict(wordlist):
    wordfreq = [wordlist.count(p) for p in wordlist]
    return dict(zip(wordlist,wordfreq))

def sortFreqDict(freqdict):
    aux = [(freqdict[key], key) for key in freqdict]
    aux.sort()
    aux.reverse()
    return aux

a = ["hello, how are you", "hello", "how are you doing","are you ok"]
wordstring = ' '.join(a)
wordlist = wordstring.split()

wordfreq = [wordlist.count(w) for w in wordlist] # a list comprehension

dictionary = wordListToFreqDict(wordlist)
sorteddict = sortFreqDict(dictionary)

for s in sorteddict: print(str(s))

Result: (3, 'you') (3, 'are') (2, 'how') (1, 'ok') (1, 'hello,') (1, 'hello') (1, 'doing')
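This result counts "hello," and "hello" as two different words, because the sentences are split on whitespace without removing punctuation. A possible variant of the same dictionary-based idea that strips punctuation first and counts in a single pass is sketched below; the function names `word_freq` and `sort_freq` are illustrative and not part of the answer:

```python
import string

def word_freq(sentences):
    """Count words across all sentences, ignoring surrounding punctuation."""
    freq = {}
    for sentence in sentences:
        for word in sentence.split():
            word = word.strip(string.punctuation)
            if word:  # skip tokens that were punctuation only
                freq[word] = freq.get(word, 0) + 1
    return freq

def sort_freq(freq):
    """Return (count, word) pairs, highest count first."""
    return sorted(((count, word) for word, count in freq.items()), reverse=True)

a = ["hello, how are you", "hello", "how are you doing", "are you ok"]
result = sort_freq(word_freq(a))
for pair in result:
    print(pair)
```

With this change "hello" is counted twice, matching the output requested in the question. Counting with a dictionary also avoids calling list.count once per word, which rescans the whole list each time.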