Question

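I'm trying to get my program to report the word that shows up the most in a text file. For example, if I type in "Hello I like pie because they are so good", the program should print out "like occurred the most". I get this error when executing Option 3: KeyError: 'h'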

#Prompt the user to enter a block of text.
done = False
textInput = ""
while(done == False):
    nextInput= input()
    if nextInput== "EOF":
        break
    else:
        textInput += nextInput

#Prompt the user to select an option from the Text Analyzer Menu.
print("Welcome to the Text Analyzer Menu! Select an option by typing a number"
    "\n1. shortest word"
    "\n2. longest word"
    "\n3. most common word"
    "\n4. left-column secret message!"
    "\n5. fifth-words secret message!"
    "\n6. word count"
    "\n7. quit")

#Set option to 0.
option = 0

#Use the 'while' to keep looping until the user types in Option 7.
while option !=7:
    option = int(input())

#The error occurs in this specific section of the code.
#If the user selects Option 3,
    elif option == 3:
        word_counter = {}
        for word in textInput:
            if word in textInput:
                word_counter[word] += 1
            else:
                word_counter[word] = 1

        print("The word that showed up the most was: ", word)

Answer 1

I think you probably want to do:

for word in textInput.split():
  ...

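Currently, you are just iterating over every character in textInput. So, to iterate over every word, we must first split the string into a list of words. By default, .split() splits on whitespace, but you can change this by passing a separator to split().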
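
To see the difference concretely, here is a small sketch comparing iteration over a string with iteration over the result of .split(). The sample sentence and variable names are made up for illustration and do not come from the question.

```python
# Illustrative sample text; not taken from the question's input.
text = "pie is good, pie is great"

# Iterating over a string yields single characters, not words.
chars = [ch for ch in text][:3]
print(chars)  # ['p', 'i', 'e']

# split() with no argument splits on runs of whitespace.
words = text.split()
print(words)  # ['pie', 'is', 'good,', 'pie', 'is', 'great']

# Passing a separator changes where the string is cut.
parts = text.split(", ")
print(parts)  # ['pie is good', 'pie is great']
```

Note that plain whitespace splitting leaves punctuation attached ('good,'), which matters if you want "good," and "good" counted as the same word.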

Also, you need to check whether the word is in your dictionary, not in the original string. So try:

if word in word_counter:
  ...
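
The KeyError comes from `word_counter[word] += 1` running for a key that has not been stored yet: every item taken from textInput is trivially "in textInput", so the `else` branch that creates the entry never runs. Below is a minimal sketch of the corrected check. The second loop, using dict.get, is my own variation rather than part of this answer, and the input string is illustrative.

```python
text = "hello hello world"  # illustrative input

word_counter = {}
for word in text.split():
    if word in word_counter:      # test membership in the dict, not the string
        word_counter[word] += 1
    else:
        word_counter[word] = 1

# Equivalent variation: get() returns 0 for words not seen yet.
alt_counter = {}
for word in text.split():
    alt_counter[word] = alt_counter.get(word, 0) + 1

print(word_counter)  # {'hello': 2, 'world': 1}
```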

Then, to find the entry with the highest count:

highest_word = ""
highest_value = 0

for k,v in word_counter.items():
  if v > highest_value:
    highest_value = v
    highest_word = k

Then, just print the values of highest_word and highest_value.
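
Put together, the scan and the final print might look like the sketch below. The counts in word_counter are hypothetical, and the max() line at the end is an alternative I am adding, not something this answer proposes.

```python
word_counter = {"pie": 3, "is": 2, "good": 1}  # hypothetical counts

highest_word = ""
highest_value = 0
for k, v in word_counter.items():
    if v > highest_value:
        highest_value = v
        highest_word = k

print("The word that showed up the most was:", highest_word)
print("It appeared", highest_value, "times")

# Shorter alternative: max() with the dict's get method as the key function.
best = max(word_counter, key=word_counter.get)
```

Both versions report only one word, so ties are silently dropped.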

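To keep track of ties, keep a list of the highest words. If we find a higher count, clear the list and start rebuilding it. Here is the full program so far: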

textInput = "He likes eating because he likes eating"
word_counter = {}
for word in textInput.split():
  if word in word_counter:
    word_counter[word] += 1
  else:
    word_counter[word] = 1


highest_words = []
highest_value = 0

for k,v in word_counter.items():
  # if we find a new value, create a new list,
  # add the entry and update the highest value
  if v > highest_value:
    highest_words = []
    highest_words.append(k)
    highest_value = v
  # else if the value is the same, add it
  elif v == highest_value:
    highest_words.append(k)

# print out the highest words
for word in highest_words:
  print(word)
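
One thing the program above does not address: "He" and "he" are counted as different words, and any attached punctuation would split counts further. A possible refinement, assuming case-insensitive matching is what you want, is sketched below. The normalization step, the trailing period in the sample, and the use of dict.get are my additions, not part of the answer.

```python
import string

text = "He likes eating because he likes eating."  # sample with a trailing period

word_counter = {}
for raw in text.split():
    # Lowercase and strip surrounding punctuation so "He" and "he" match.
    word = raw.lower().strip(string.punctuation)
    if not word:
        continue  # skip tokens that were punctuation only
    word_counter[word] = word_counter.get(word, 0) + 1

highest_value = max(word_counter.values())
highest_words = [w for w, n in word_counter.items() if n == highest_value]
print(highest_words)  # ['he', 'likes', 'eating']
```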

Answer 2

Instead of rolling your own counter, a better idea is to use Counter from the collections module.

>>> from collections import Counter
>>> text = 'blah and stuff and things and stuff'
>>> c = Counter(text.split())
>>> c.most_common()
[('and', 3), ('stuff', 2), ('blah', 1), ('things', 1)]
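
If you only need the top entry, most_common accepts a count. The sketch below also shows one way to keep ties, which is an addition of mine rather than something this answer covers.

```python
from collections import Counter

text = "blah and stuff and things and stuff"
counts = Counter(text.split())

# most_common(1) returns a list with the single top (word, count) pair.
top_word, top_count = counts.most_common(1)[0]
print(top_word, top_count)  # and 3

# To keep ties, collect every word whose count equals the top count.
tied = [w for w, n in counts.items() if n == top_count]
print(tied)  # ['and']
```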

Also, as a general code style matter, please avoid adding comments like this:

#Set option to 0.
option = 0

It makes your code less readable, not more.

Answer 3

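The original answer is certainly correct, but you may want to keep in mind that it won't show you ties for first place. A sentence like

A life in the present is a present itself.

will only reveal either 'a' or 'present' as the number one hit. In fact, since dictionaries are (generally) unordered (they only preserve insertion order from Python 3.7 onward), the result you see might not even be the first word that is repeated multiple times.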
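
Here is a small sketch of that situation. It assumes the words are lowercased and the final period is stripped; with a plain case-sensitive split, 'A' and 'a' would be counted separately and 'present' would win outright.

```python
sentence = "A life in the present is a present itself."

counts = {}
for raw in sentence.split():
    word = raw.lower().strip(".")   # fold case so "A" and "a" count together
    counts[word] = counts.get(word, 0) + 1

# A single-winner lookup keeps only the first word that has the top count.
winner = max(counts, key=counts.get)
print(winner)              # a
print(counts["present"])   # 2, tied with 'a' but not reported
```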

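If you need to report multiples, might I suggest the following:

1) Use your current key-value approach for 'word': 'hits'.
2) Determine the greatest value of 'hits'.
3) Find the values equal to the greatest number of hits and add their keys to a list.
4) Iterate through the list to display the words with the greatest number of hits.

For example: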

# wordCounter is the word -> hits dictionary built earlier
greatestNumber = 0
# establish the highest number for wordCounter.values()
for hits in wordCounter.values():
    if hits > greatestNumber:
        greatestNumber = hits

topWords = []
#find the keys that are paired to that value and add them to a list
#we COULD just print them as we iterate, but I would argue that this
#makes this function do too much
for word in wordCounter.keys():
    if wordCounter[word] == greatestNumber:
        topWords.append(word)

#now reveal the results
print("The words that showed up the most, with %d hits:" % greatestNumber)
for word in topWords:
    print(word)
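
The same two-pass idea can be written more compactly with max() and a list comprehension. This is a sketch under my own assumptions: top_words is a hypothetical helper name, and the sample dictionary is made up.

```python
def top_words(word_counter):
    """Return (words, hits) for every word tied at the highest count."""
    if not word_counter:
        return [], 0
    greatest = max(word_counter.values())
    return [w for w, n in word_counter.items() if n == greatest], greatest

words, hits = top_words({"a": 2, "life": 1, "present": 2})
print("The words that showed up the most, with %d hits:" % hits)
for word in words:
    print(word)
```

Keeping the lookup in a function and the printing outside it follows the same reasoning as the comment above about not making one function do too much.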

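Depending on whether you use Python 2.7 or Python 3, your mileage (and syntax) may vary. But ideally, IMHO, you would first determine the highest number of hits and then go back and add the relevant entries to a new list.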

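EDIT: you should probably just use Counter as suggested in another answer. I didn't even know this was something Python came prepared to do. Haha, don't accept my answer unless you absolutely have to write your own counter! It seems there is already a module for that.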

Answer 4

With Python 3.6+, you can use statistics.mode.
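
A minimal sketch of the statistics.mode approach, assuming the text is split into words first. The sample string and variable names are illustrative. As far as I know, mode raises StatisticsError on ties before Python 3.8 and returns the first mode encountered from 3.8 on; multimode, also added in 3.8, returns all of them.

```python
from statistics import mode, multimode

text = "blah and stuff and things and stuff"  # illustrative input

# mode() works on any hashable data, including strings.
most_common_word = mode(text.split())
print(most_common_word)  # and

# multimode() (Python 3.8+) returns every value tied for most frequent.
all_modes = multimode("a b b a c".split())
print(all_modes)  # ['a', 'b']
```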

Answer 5

I'm not too keen on Python, but in your final print statement, shouldn't you have a %s?

i.e.: print("The word that showed up the most was: %s" % word)
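
For reference, a %s is not required in Python 3: print() accepts several arguments and joins them with a space. The sketch below shows three equivalent ways to build the message, using a hypothetical value for word. The problem with the question's print call is rather that it prints the loop variable word instead of the most frequent entry.

```python
word = "pie"  # hypothetical result

# Passing several arguments: print() joins them with a space.
print("The word that showed up the most was:", word)

# %-formatting needs the % operator, not a comma.
msg_percent = "The word that showed up the most was: %s" % word

# f-strings (Python 3.6+) embed the value directly.
msg_fstring = f"The word that showed up the most was: {word}"
```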

Python: find the word that shows up the most?
