How to search for individual words rather than the complete phrase

Time: 2016-04-21 20:33:51

Tags: python

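I have code that reads from a text file and prints every line that contains a keyword (taken from user input). However, the entire phrase that was entered is searched for, rather than its individual words. For example, if I enter

I have a fast car

the file is searched for the whole phrase, rather than just the keywords, such as fast or car.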

file = open("file.txt", "r")
search= input("What to be searched?  ")
for line in file:
    if search in line:
        print ("found" +line)

file.close()

2 answers:

Answer 0 (score: 4)

Try this:

search = input("What to be searched?  ").split()

with open("file.txt", "r") as f:
    for line in f:
        if any(word in line for word in search):
            print ("found" + line)
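
Note that `word in line` is a substring test, so searching for "car" would also match a line containing "carpet". If whole-word matches are wanted, one possible sketch uses the standard `re` module with `\b` word boundaries. The helper name `line_has_word` and the case-insensitive matching are assumptions made for illustration; they are not part of the answer above.

```python
import re

def line_has_word(line, words):
    """Return True if any of `words` occurs in `line` as a whole word."""
    for word in words:
        # re.escape guards against regex metacharacters in user input;
        # \b anchors the match at word boundaries.
        pattern = r"\b" + re.escape(word) + r"\b"
        if re.search(pattern, line, re.IGNORECASE):
            return True
    return False
```

In the loop above, `if line_has_word(line, search):` could then stand in for the `any(...)` test. Word boundaries work as expected only for search terms that start and end with letters, digits, or underscores.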

Answer 1 (score: 1)

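The previous answer handles your immediate problem: searching for individual words rather than the whole phrase. However, you raise a harder problem: searching for keywords. I infer that when you enter "I have a fast car", you want to ignore the common words and search only for "fast" and "car".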

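This is not a trivial problem in natural language processing (NLP). Some words are common in some contexts but significant in others. To simplify the problem, many applications keep a list of "stop" words: words considered too trivial to be used as keywords. That turns your solution into something like this, modifying @Apero's code: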

search = input("Enter search text:  ").split()
# Remove trivial words from the search list; add your own words to this.
stop_word = ["a", "an", "the", "am", "is", "are", "have", "has"]
# Keep only the words that are not stop words (compared in lower case).
search = [word for word in search if word.lower() not in stop_word]

with open("file.txt", "r") as f:
    for line in f:
        if any(word in line for word in search):
            print ("found" + line)
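
To check the filtering step in isolation on the phrase from the question, a small sketch like the following may help. The function name `extract_keywords`, the constant `STOP_WORDS`, and the extra stop word "i" are assumptions added for illustration; pick whatever list suits your data.

```python
# Hypothetical stop-word set for illustration; extend it as needed.
STOP_WORDS = {"a", "an", "the", "am", "is", "are", "have", "has", "i"}

def extract_keywords(text, stop_words=STOP_WORDS):
    """Split `text` on whitespace and drop stop words (case-insensitive)."""
    return [word for word in text.split() if word.lower() not in stop_words]

keywords = extract_keywords("I have a fast car")
print(keywords)  # ['fast', 'car']
```

A set is used here instead of a list only because membership tests on a set are cheaper; with a handful of stop words either works.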