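I'm trying to make my program report the word that shows up the most in a text file. For example, if I type in "Hello I like pie because they are like so good", the program should print out "like occurred the most". I get this error when executing Option 3: KeyError: 'h'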
#Prompt the user to enter a block of text.
done = False
textInput = ""
while(done == False):
    nextInput = input()
    if nextInput == "EOF":
        break
    else:
        textInput += nextInput
#Prompt the user to select an option from the Text Analyzer Menu.
print("Welcome to the Text Analyzer Menu! Select an option by typing a number"
      "\n1. shortest word"
      "\n2. longest word"
      "\n3. most common word"
      "\n4. left-column secret message!"
      "\n5. fifth-words secret message!"
      "\n6. word count"
      "\n7. quit")
#Set option to 0.
option = 0
#Use the 'while' to keep looping until the user types in Option 7.
while option != 7:
    option = int(input())
    #The error occurs in this specific section of the code.
    #If the user selects Option 3,
    elif option == 3:
        word_counter = {}
        for word in textInput:
            if word in textInput:
                word_counter[word] += 1
            else:
                word_counter[word] = 1
        print("The word that showed up the most was: ", word)
Answer 0 (score: 2)
I think you probably want to do:
for word in textInput.split():
    ...
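Currently, you are just iterating over every character in textInput. To iterate over each word, we first have to split the string into a list of words. By default, .split() splits on whitespace, but you can change that by passing a delimiter to split().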
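To see the difference concretely, here is a small sketch that iterates a string directly and then iterates it after splitting. The sample strings and variable names are made up for illustration and are not from the question.

```python
# Sample text chosen for illustration only.
text = "hello I like pie"

# Iterating a string directly yields one character at a time,
# which is how a single character such as 'h' ends up as a dictionary key.
chars = [ch for ch in text]

# split() with no argument splits on runs of whitespace.
words = text.split()

# Passing a separator splits on that exact string instead.
fields = "a,b,c".split(",")

print(chars[:5])
print(words)
print(fields)
```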
Also, you need to check whether the word is in your dictionary, not in the original string. So try:
if word in word_counter:
    ...
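Put together, the corrected counting loop might look like the sketch below. The count_words helper name is my own, not something from the question, and the dict.get variant mentioned in the comment is just an equivalent shortcut.

```python
def count_words(text):
    """Return a dict mapping each whitespace-separated word to its count."""
    word_counter = {}
    for word in text.split():
        if word in word_counter:      # membership test against the dict
            word_counter[word] += 1
        else:
            word_counter[word] = 1
        # Equivalent shortcut for the if/else above:
        # word_counter[word] = word_counter.get(word, 0) + 1
    return word_counter


counts = count_words("Hello I like pie because they are like so good")
print(counts["like"])
```

Note that this treats "Hello" and "hello" as different words, since nothing here changes the case of the input.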
Then, to find the entry with the most occurrences:
highest_word = ""
highest_value = 0
for k, v in word_counter.items():
    if v > highest_value:
        highest_value = v
        highest_word = k
Then just print out the values of highest_word and highest_value.
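If ties do not matter, the same search can also be written with the built-in max, using the dictionary's get method as the key function. This is only a sketch with a hand-written sample dictionary standing in for word_counter; note that max raises ValueError on an empty dictionary.

```python
# Hand-written counts standing in for the word_counter built earlier.
word_counter = {"Hello": 1, "I": 1, "like": 2, "pie": 1}

# max() iterates over the keys and compares them by their counts.
highest_word = max(word_counter, key=word_counter.get)
highest_value = word_counter[highest_word]

print("The word that showed up the most was:", highest_word)
print("It appeared", highest_value, "times")
```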
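To keep track of ties, keep a list of the highest words. If we find a higher count, clear the list and start rebuilding it. Here is the full program so far: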
textInput = "He likes eating because he likes eating"
word_counter = {}
for word in textInput.split():
    if word in word_counter:
        word_counter[word] += 1
    else:
        word_counter[word] = 1
highest_words = []
highest_value = 0
for k, v in word_counter.items():
    # if we find a new highest value, start a new list,
    # add the entry and update the highest value
    if v > highest_value:
        highest_words = []
        highest_words.append(k)
        highest_value = v
    # else if the value is the same, add it
    elif v == highest_value:
        highest_words.append(k)
# print out the highest words
for word in highest_words:
    print(word)
Answer 1 (score: 2)
Rather than rolling your own counter, a better idea is to use Counter from the collections module.
>>> text = 'blah and stuff and things and stuff'
>>> from collections import Counter
>>> c = Counter(text.split())
>>> c.most_common()
[('and', 3), ('stuff', 2), ('blah', 1), ('things', 1)]
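As a sketch of how Counter could be used for the "most common word" option, including ties: the sample sentence is borrowed from the earlier answer, and the variable names are my own.

```python
from collections import Counter

text = "He likes eating because he likes eating"
counts = Counter(text.split())

# most_common(1) returns a list holding the single top (word, count) pair.
top_word, top_count = counts.most_common(1)[0]

# Collect every word that shares the top count, so ties are not lost.
tied = [word for word, n in counts.items() if n == top_count]

print(top_count)
print(tied)
```

Indexing most_common(1)[0] raises IndexError when the text contains no words, so real code would want to check for empty input first.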
Also, as a general code-style matter, avoid adding comments like this:
#Set option to 0.
option = 0
It makes your code less readable, not more.
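Answer 2 (score: 1)
The original answer is certainly correct, but keep in mind that it won't show you ties for first place. A sentence like
A life in the present is a present itself.
will only show either 'a' or 'present' as the number-one hit. In fact, since dictionaries are (generally) unordered, at least before Python 3.7, the result you see may not even be the first word that is repeated multiple times.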
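If you need to report multiples, may I suggest the following:
1) Use your current key-value approach for 'word': 'hits'.
2) Determine the greatest value of 'hits'.
3) Check for values equal to the greatest number of hits, and add those keys to a list.
4) Iterate through the list to display the words with the greatest number of hits.
For example: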
# wordCounter is the word -> hits dictionary built earlier
greatestNumber = 0
# establish the highest number for wordCounter.values()
for hits in wordCounter.values():
    if hits > greatestNumber:
        greatestNumber = hits
topWords = []
# find the keys that are paired to that value and add them to a list
# we COULD just print them as we iterate, but I would argue that this
# makes this function do too much
for word in wordCounter.keys():
    if wordCounter[word] == greatestNumber:
        topWords.append(word)
# now reveal the results
print("The words that showed up the most, with %d hits:" % greatestNumber)
for word in topWords:
    print(word)
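Depending on whether you use Python 2.7 or Python 3, your mileage (and syntax) may vary. But ideally, IMHO, you would first determine the greatest number of hits and then go back and add the relevant entries to a new list.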
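The two passes described in this answer can also be folded into a small function. This is only a sketch: the name top_words is mine, and the sample dictionary is invented for the example.

```python
def top_words(word_counter):
    """Return (highest count, list of words with that count)."""
    if not word_counter:
        return 0, []
    # First pass: find the highest count.
    greatest_number = max(word_counter.values())
    # Second pass: collect every word paired with that count.
    winners = [w for w, hits in word_counter.items() if hits == greatest_number]
    return greatest_number, winners


hits, words = top_words({"a": 1, "present": 2, "life": 1, "is": 2})
print("The words that showed up the most, with %d hits:" % hits)
for word in words:
    print(word)
```

Returning the list instead of printing inside the function keeps the counting logic separate from the display, which is the same concern raised in the comments of the code above.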
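EDIT: You should probably use Counter, as suggested in the other answer. I didn't even know this was something Python came ready to do. Haha, don't accept my answer unless you absolutely must write your own counter! It seems there's already a module for that.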
Answer 3 (score: 0)
With Python 3.6+, you can use statistics.mode on the list of words.
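A minimal sketch of what the statistics.mode approach could look like; the sample text and variable names are my own. One caveat: from Python 3.8 on, mode returns the first of several equally common values, while earlier versions raise StatisticsError on a tie, so the example uses a string with a single clear winner.

```python
from statistics import mode

text = "blah and stuff and things and stuff"

# mode() returns the most common item in the sequence of words.
most_common_word = mode(text.split())

print(most_common_word)
```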
Answer 4 (score: -1)
I'm not too hot on Python, but in your last print statement, shouldn't you have a %s?
i.e.: print("The word that showed up the most was: %s", word)