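I'm practicing Python for the first time and ran into this problem. Using a variable called text, I entered a short paragraph and split it on spaces. That gives me the paragraph's words, which I store in a dictionary along with the number of times each one occurs. My end goal is to build a new list of the words that occur more than "x" times.
My code is: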
text = '''Population refers to the number of individuals in a particular
place. It could be the number of humans or any other life form living in a
certain specified area. The number of tigers living in a forest is
therefore referred to as the population of tigers in the forest. The
number of people living in a town or city or an entire country is the
human population in that particular area.'''
words = text.split(" ")
a = dict()
for word in words:
    if word not in a:
        a[word] = 1
    else:
        a[word] += 1
newlist = list()
val = 7
for key, value in a.items():
    if a[key] > val:
        newlist.append(i)
After running the last line, the final output I get is:
['years.', 'years.', 'years.', 'years.']
I don't know where I'm going wrong.
Answer 0 (score: 1)
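To create a dictionary with the words as keys and their occurrence counts as values, you first need to get all the unique words. You can do that with Python's set function.
Then you iterate over that set and use the list's count method to get the number of occurrences of each word.
You can see it below: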
text = '''Population refers to the number of individuals in a particular
place. It could be the number of humans or any other life form living in a
certain specified area. The number of tigers living in a forest is
therefore referred to as the population of tigers in the forest. The
number of people living in a town or city or an entire country is the
human population in that particular area.'''
words = text.split() # Split text and create a list of all words
wordset = set(words) # Get all unique words
wordDict = dict((word, words.count(word)) for word in wordset) # Create dictionary of words and number of occurrences
for key, value in wordDict.items():
    print(key + ' : ' + str(value))
This will give you:
individuals : 1
forest : 1
the : 5
could : 1
therefore : 1
place. : 1
form : 1
or : 3
country : 1
population : 2
humans : 1
The : 2
city : 1
living : 3
Population : 1
life : 1
in : 6
a : 4
refers : 1
tigers : 2
is : 2
to : 2
be : 1
an : 1
other : 1
as : 1
particular : 2
number : 4
human : 1
It : 1
any : 1
forest. : 1
town : 1
that : 1
certain : 1
of : 5
entire : 1
people : 1
specified : 1
referred : 1
area. : 2
You can then apply your own filter to get all the words that occur more than x times.
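As a rough sketch of that filtering step (not the only way to do it), the snippet below builds the counts as above and keeps the words whose count is strictly greater than a threshold. The names word_counts and words_above and the threshold of 3 are my own choices for illustration; the question used val = 7, which no word in this paragraph exceeds. Note that the loop in the question appends i rather than key. Since i is never assigned in the code shown, it presumably still held a leftover value from something run earlier, which would explain the repeated 'years.'.

```python
text = '''Population refers to the number of individuals in a particular
place. It could be the number of humans or any other life form living in a
certain specified area. The number of tigers living in a forest is
therefore referred to as the population of tigers in the forest. The
number of people living in a town or city or an entire country is the
human population in that particular area.'''

words = text.split()  # Split on any whitespace, including newlines

# Map each unique word to the number of times it occurs in the list
word_counts = {word: words.count(word) for word in set(words)}


def words_above(counts, threshold):
    """Return, in sorted order, the words whose count is strictly greater than threshold."""
    return sorted(word for word, count in counts.items() if count > threshold)


frequent = words_above(word_counts, 3)
print(frequent)  # ['a', 'in', 'number', 'of', 'the']
```

The result is sorted only to make the output order predictable, because iterating over a set gives no guaranteed order. With a threshold of 7 the same call returns an empty list for this paragraph.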