Question

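I am writing a simple program that takes two inputs, an input file and a word to search for. It should then print every line that contains that word. For example, my input file contains 5 sentences, as follows: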

My cat is named garfield
He is my first Cat
My mom is named cathy
This is a catastrophe
Hello how are you

The word I want to check for is cat.

Here is the code I wrote:

import sys

input_file = sys.argv[1]
input_file = open(input_file, "r")
wordCheck = sys.argv[2]

for line in input_file:
    if wordCheck in line:  # substring test, so "cathy" also matches
        print(line)

input_file.close()

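Now obviously this returns lines 1, 3 and 4, because they all contain "cat" at some point. My question is: how would I change this so that only line 1 (the only line containing the word "cat" by itself) is printed?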

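My second question is: what is the best way to get all the lines that contain the word "cat" regardless of case? In this case that would return lines 1 and 2, because they contain "cat" and "Cat" respectively. Thanks in advance.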

Answer 1

You can use regular expressions:

import re

# '\b': word boundary, re.I: case insensitive
# re.escape: treat the search word as literal text, not as a pattern
pat = re.compile(r'\b{}\b'.format(re.escape(wordCheck)), flags=re.I)

for line in input_file:
    if pat.search(line):
        print(line)
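
As a self-contained illustration of the pattern above, the following sketch runs the same whole-word, case-insensitive search over the five sample lines from the question. The function name lines_with_word and the in-memory sample list are made up for this example (the question reads from a file instead), and re.escape is used on the assumption that the search word is plain text rather than a regex.

```python
import re

def lines_with_word(lines, word):
    """Return the lines that contain `word` as a whole word, ignoring case."""
    # \b matches a word boundary; re.escape keeps characters such as '.' literal.
    pat = re.compile(r'\b{}\b'.format(re.escape(word)), flags=re.I)
    return [line for line in lines if pat.search(line)]

# The five sample lines from the question, held in a list for illustration.
sample = [
    "My cat is named garfield",
    "He is my first Cat",
    "My mom is named cathy",
    "This is a catastrophe",
    "Hello how are you",
]

for line in lines_with_word(sample, "cat"):
    print(line)
```

With this data the loop should print only lines 1 and 2, since "cathy" and "catastrophe" have no word boundary right after "cat". Dropping flags=re.I would make the match case-sensitive and leave only line 1.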

Answer 2

Here is a short way to do it that applies the in operator to the list of words on each line instead of to the string itself.

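word = 'cat'
for line in lines:  # lines: the lines to search, e.g. the open input file
    # split() with no argument splits on any whitespace, so a trailing
    # newline does not stay attached to the last word
    if word in line.split(): # use `in` on a list of all the words of that line.
        print(line)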

Output: My cat is named garfield
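
To make the difference between the two uses of in concrete, here is a small sketch. The helper name has_word is hypothetical, and the third call uses an invented sentence to show one limitation of this approach: punctuation stays attached to the token, so a regex or some stripping step may be needed for real text.

```python
def has_word(line, word):
    """Return True if `word` is one of the whitespace-separated tokens of `line`."""
    # split() with no argument splits on any run of whitespace
    # and ignores the trailing newline of a line read from a file.
    return word in line.split()

# Substring test on the string: "cat" is found inside "cathy".
print("cat" in "My mom is named cathy")

# Token test on the word list: "cathy" is not equal to "cat".
print(has_word("My mom is named cathy", "cat"))

# Limitation: the last token here is "cat." (with the period), not "cat".
print(has_word("I love my cat.", "cat"))
```

The first call should print True and the other two False.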

Answer 3

For the first question, you can use a break statement to stop the loop after the first match:

for line in input_file:
    if wordCheck in line.split():  # split() also drops the trailing newline
        print(line)
        break # add break here

For the second question, you can use the lower() function to convert everything to lowercase, so that both Cat and cat are detected.

for line in input_file:
    # lower both sides so a search word such as "Cat" still matches
    if wordCheck.lower() in line.lower().split():
        print(line)
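
The two loops above can be folded into one helper, sketched here under some assumptions: the name matching_lines and the ignore_case parameter are invented for this example, and the sample list stands in for the input file from the question.

```python
def matching_lines(lines, word, ignore_case=False):
    """Yield each line that contains `word` as a whitespace-separated token."""
    target = word.lower() if ignore_case else word
    for line in lines:
        # Lower-case the line as well when ignoring case, then tokenize.
        tokens = line.lower().split() if ignore_case else line.split()
        if target in tokens:
            yield line

# The five sample lines from the question, held in a list for illustration.
sample = [
    "My cat is named garfield",
    "He is my first Cat",
    "My mom is named cathy",
    "This is a catastrophe",
    "Hello how are you",
]

print(list(matching_lines(sample, "cat")))
print(list(matching_lines(sample, "Cat", ignore_case=True)))
```

With the sample data, the case-sensitive call should yield only line 1 even without break, because splitting into tokens already excludes "cathy" and "catastrophe". The case-insensitive call should yield lines 1 and 2.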

How to return only the lines that contain a specific word

3 answers: