Question

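Before I start, here is the exact wording from my learning material (Grok Learning, Python): "Write a program to read multiple lines of input from the user, where each line is a space-separated sentence of words. Your program should then count how many times each bigram occurs across all input sentences. The bigrams should be treated case-insensitively by converting the input lines to lowercase. Once the user stops entering input, your program should print each bigram that appears more than once, along with its frequency."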

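I am supposed to find bigrams across several inputs, and I came up with this code. It keeps asking for input until the input is empty, appending each whole line to a list called combined, which is then converted to bigrams in the format [('this', 'is'), ('is', 'a')] in a list called text. The list text is then converted to plain bigram strings in the format [('this is'), ('is a')] in another list called newlist. I then put every string into a dictionary called my_dict together with its count. Finally I print them one per line, giving each bigram and its frequency and excluding bigrams that occur only once. Here is my code: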

newlist = []
combined = []
a = (input("Line: ")).lower()
while a:
  combined.append(a)
  text = [b for l in combined for b in zip(l.split(" ")[:-1], l.split(" ")[1:])]
  a = (input("Line: ")).lower()
for bigram in text:
  newbigram = ' '.join(bigram)
  newlist.append(newbigram)
my_dict = {i:newlist.count(i) for i in newlist}
for words in sorted(my_dict):
  if my_dict[words] > 1:
    print(str(words) + str(": ") + str(my_dict[words]))

Here is my output:

Line: The big red ball
Line: The big red ball is near the big red box
Line: I am near the box
Line: 
big red: 3
near the: 2
red ball: 2
the big: 3

As you can see, the code works fine, but whenever I enter an empty value as the first line, it gives the following error message:

Line: 
Traceback (most recent call last):
  File "program.py", line 8, in <module>
    for bigram in text:
NameError: name 'text' is not defined

Why does this happen, and how do I fix it?

Answer 1

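Note that input() returns the line without its trailing newline, so a truly empty line comes back as an empty string and your while condition is already False at that point. What can still slip through is a line containing only spaces, which is a non-empty string and therefore evaluates to True. You can strip the user input so that leading/trailing whitespace is ignored, since it doesn't matter anyway, e.g.: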

a = input("Line: ").strip().lower()
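As a quick illustration of why stripping matters for the loop condition, here is a small check of how a few candidate input strings evaluate before and after `strip()`. The sample strings are made up for this sketch.

```python
# Candidate values that input() might return (illustrative samples only).
samples = ["", "   ", "\t", "The big red ball"]

# Truthiness as the while condition would see it, raw and stripped.
truthy_raw = [bool(s) for s in samples]
truthy_stripped = [bool(s.strip()) for s in samples]

for s, raw, stripped in zip(samples, truthy_raw, truthy_stripped):
    print(repr(s), raw, stripped)
```

Only the whitespace-only strings change: they are truthy as-is and falsy once stripped.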

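If the user never enters anything (only empty lines), the while body never runs, so the text list is never assigned and the later for loop raises the NameError. Initialize it before the while loop.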
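A minimal sketch of that fix, assuming the rest of the question's logic stays roughly the same. The loop is wrapped in a hypothetical function, `repeated_bigrams`, that takes a list of lines instead of calling `input()`, purely so it can be run without a prompt. The key change is that `text` is bound before the loop.

```python
def repeated_bigrams(lines):
    """Return {bigram string: count} for bigrams seen more than once.

    `lines` stands in for successive input() results (an assumption made
    for this sketch); the first blank line ends the input.
    """
    text = []  # bound before the loop, so it exists even with no input
    for raw in lines:
        a = raw.strip().lower()
        if not a:  # empty or whitespace-only line: stop reading
            break
        words = a.split()
        text.extend(zip(words, words[1:]))  # adjacent word pairs

    newlist = [" ".join(bigram) for bigram in text]
    counts = {b: newlist.count(b) for b in newlist}
    return {b: n for b, n in counts.items() if n > 1}


# An empty first line now yields an empty result instead of a NameError.
print(repeated_bigrams([""]))  # prints {}
```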

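That said, you are overcomplicating this. All you need is a dictionary counter, iterating over the bigram tuples of each input line to increase the counts, e.g.: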

import collections

counter = collections.defaultdict(int)  # use a dictionary factory to speed up our counting
while True:
    line = input("Line: ").lower().split()  # lowercase and split on whitespace
    if not line:  # user entered an empty line
        break
    for bigram in zip(line, line[1:]):  # iterate over bigrams
        counter[bigram] += 1  # increase the count of each bigram
for bigram, count in counter.items():  # use counter.iteritems() on Python 2.x
    if count > 1:  # consider only bigrams with larger count than 1
        print("{} {}: {}".format(bigram[0], bigram[1], count))
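For comparison, here is a sketch of the same idea using `collections.Counter` from the standard library, with the output sorted alphabetically as in the question. The helper name `count_bigrams` and the list-of-lines parameter are assumptions made so the example runs without interactive input; they are not part of the code above.

```python
from collections import Counter


def count_bigrams(lines):
    """Count bigrams over `lines`, stopping at the first blank line."""
    counter = Counter()
    for raw in lines:
        words = raw.lower().split()  # lowercase and split on whitespace
        if not words:  # blank line ends the input
            break
        counter.update(zip(words, words[1:]))  # add each adjacent pair
    return counter


# Sample lines taken from the question's example session.
sample = [
    "The big red ball",
    "The big red ball is near the big red box",
    "I am near the box",
    "",
]

for (first, second), count in sorted(count_bigrams(sample).items()):
    if count > 1:  # only bigrams that occur more than once
        print(f"{first} {second}: {count}")
```

Sorting the `(first, second)` tuples orders the bigrams by first word, then second word, which for this sample gives the same order as the output shown in the question.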
