Question

I'm learning Python and am currently trying to write a script that checks a string given as user input against the 1,000 most common words (as shown here).

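So far, I've been able to take a word via raw_input, search the list (saved as a .txt file), and determine whether the user's input is in the file. However, I can't figure out how to search the text and reply with just "Word is in the 1000 words" or "Word is not in the list". I can only get it to reply "word is not in the list" for every line.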

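I'm basically trying to create a script that compares the user's input and checks whether all the words in that input are among the 1,000 most common words (obviously prompted by this XKCD comic). In the end, I'd like to "recreate" what this website does, but as a Python script.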

Here's what I have so far:

cmnwords = open('C:\\Users\\[username]\\1000words.txt')
uInput = raw_input("What is your sentence? ")


def checkInput():
    for line in cmnwords:
        if uInput not in line:
            print uInput, "is not in the most common words"
        else:
            print uInput, "is OKAY! :D", line
checkInput()

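The above sort of works, but it replies after every line. I just want to know "Yes, the user's input string is in the list of most common words" or "No! [word] is not among the most common words, try again", without having to see an answer for each line.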

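(Also, how do I search for the exact user input? If you run the above and the user input is "you", it treats "young", "yourself", and others as OK. No, I only want it to find "you".)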

Does this make sense? Thanks for any help, and let me know if I can clarify anything.

Answer 1

Let cmnwords be a list of strings containing the 1,000 most common words.

You can then use word in cmnwords to test whether a given string is among the most common words.

cmnwords = open('C:\\Users\\[username]\\1000words.txt').read().splitlines()
# cmnwords = ['a', 'able', ...]
uInput = raw_input("What is your word? ")

def checkInput():
    # optionally you can use uInput.lower() below so that the search is case-insensitive
    if uInput in cmnwords:
        print uInput, "is OKAY! :D"
    else:
        print uInput, "is not in the most common words"

checkInput()

This also solves the problem of partial matches such as "yourself" that you mentioned in the question.
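To illustrate why this avoids partial matches: in applied to a string performs a substring search, while in applied to a list or set compares whole elements. Below is a minimal sketch, written in Python 3 syntax (the code above is Python 2) and using a tiny made-up word list in place of the real 1000words.txt; the names file_text, lines, and common are only for illustration.

```python
# Hypothetical stand-in for the contents of 1000words.txt, one word per line.
file_text = "you\nyoung\nyourself\n"

lines = file_text.splitlines()   # one list element per word
common = set(lines)              # a set gives fast membership tests

# Substring test against a single line: "you" is found inside "young".
print("you" in "young")          # True

# Membership test against the collection: only whole words match.
print("you" in common)           # True
print("yo" in common)            # False
```

Converting the list to a set is optional for 1,000 words, but it keeps each lookup cheap if many words are checked.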

Answer 2

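Your first problem is that you are searching for the entire input string. If the user enters "You will not go to space today", your program succeeds only if the exact string "You will not go to space today" appears in cmnwords. What you want to do is split the input into words, perhaps like this (after import re): words = [match.lower() for match in re.findall(r"[a-zA-Z']+", uInput)]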

Your second problem is that you print is OKAY! :D after checking each line of 1000words.txt. Wait until you have checked the entire file, and then print OKAY.
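Putting both points together, one possible sketch looks like the following. It is written in Python 3 syntax and is not the asker's final script: the function name uncommon_words and the small common set are hypothetical stand-ins, and in practice the set would be built from the lines of 1000words.txt.

```python
import re

def uncommon_words(sentence, common_words):
    """Return the words in `sentence` that are not in `common_words`."""
    # Split the sentence into lowercase words (letters and apostrophes only).
    words = [m.lower() for m in re.findall(r"[a-zA-Z']+", sentence)]
    # Check every word against the whole collection before reporting anything.
    return [w for w in words if w not in common_words]

# Hypothetical stand-in for the real 1,000-word list.
common = {"you", "will", "not", "go", "to", "space", "today"}

missing = uncommon_words("You will not go to orbit today", common)
if missing:
    print("Not in the most common words:", ", ".join(missing))
else:
    print("All words are OKAY! :D")
```

Collecting the unmatched words first and printing once at the end gives a single answer for the whole sentence instead of one message per line of the file.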
