Question

I am currently trying to search a text file for an exact word/phrase. I am using Python 3.4.

Here is my code so far.

import re

def main():
    fileName = input("Please input the file name").lower()
    term = input("Please enter the search term").lower()

    fileName = fileName + ".txt"

    regex_search(fileName, term)

def regex_search(file,term):
    source = open(file, 'r')
    destination = open("new.txt", 'w')
    lines = []
    for line in source:
        if re.search(term, line):
            lines.append(line)

    for line in lines:
        destination.write(line)
    source.close()
    destination.close()
'''
def search(file, term): #This function doesn't work
    source = open(file, 'r')
    destination = open("new.txt", 'w')
    lines = [line for line in source if term in line.split()]

    for line in lines:
        destination.write(line)
    source.close()
    destination.close()'''
main()

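In my function regex_search, I use a regular expression to search for a particular string. However, I don't know how to search for a particular phrase.

In the second function, search, I split the line into a list and look for the word there. However, this cannot find a specific phrase, because I would be searching for "dog walked" in ['the', 'dog', 'walked'], which will never return the right line.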
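
To make the problem with the split-based approach concrete, here is a minimal sketch. The sample line and the variable names (split_hit, substring_hit, partial_hit) are made up for illustration and are not part of the code above.

```python
line = "the dog walked home"
phrase = "dog walked"

# split() yields single words, so a two-word phrase is never one of the elements.
split_hit = phrase in line.split()

# A plain substring test does find the phrase...
substring_hit = phrase in line

# ...but a substring test also matches inside longer words ("dog" inside "dogs").
partial_hit = "dog" in "the dogs walked home"

print(split_hit, substring_hit, partial_hit)
```

So the list lookup misses phrases, while a bare substring check would presumably be too loose if partial-word matches are unwanted.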

Answer 1

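Edit: Given that you don't want to match partial words ('foo' should not match 'foobar'), you need to look ahead in the data stream. The code for that is a bit awkward, so I think a regular expression (your current regex_search, with one fix) is the way to go: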

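import re

def regex_search(filename, term):
    # Match term only when it is followed by a non-word, non-hyphen character
    # or the end of the line, so 'foo' does not match 'foobar'.
    searcher = re.compile(term + r'([^\w-]|$)').search
    # Copy every matching line from the source file to new.txt.
    with open(filename, 'r') as source, open("new.txt", 'w') as destination:
        for line in source:
            if searcher(line):
                destination.write(line)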
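
As a possible variation, and not something the answer itself proposes, the sketch below also checks the character before the term and escapes the term with re.escape so that characters such as + or . are taken literally. The helper name find_lines and the sample lines are hypothetical, and it works on an in-memory list rather than a file so it can be tried directly.

```python
import re

def find_lines(lines, term):
    """Return the lines that contain term as a whole word or phrase."""
    # Lookarounds require that no word character or hyphen touches the term
    # on either side; re.escape makes regex metacharacters in term literal.
    pattern = re.compile(r'(?<![\w-])' + re.escape(term) + r'(?![\w-])')
    return [line for line in lines if pattern.search(line)]

sample = [
    "the dog walked home\n",
    "the dog walkedx home\n",   # term runs into a longer word on the right
    "hotdog walked by\n",       # term runs into a longer word on the left
    "a dog walked.\n",          # punctuation after the term is fine
]
matches = find_lines(sample, "dog walked")
print(matches)
```

Whether the left-hand check is wanted depends on the use case; the regex in the answer only guards the right-hand side.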

Python: search a text file for an exact word/phrase
