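I am trying to write a script that scans every file under a root directory for key phrases and prints the lines in those files that contain a key phrase. Currently my script scans all the files in my directory and prints the lines containing the key phrases, but a line that appears only once in those files gets printed about 70 times. For example, if there is only ONE LINE in the entire directory that matches, that line is printed around 70 times. I suspect something is wrong with my for loops, but I have not been able to debug it. I would like the script to print each line only as many times as it actually occurs in the files. I am new to Python, please help! Thanks!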
import os

path = "D:\\MyFolder\\"  # The root directory I hope to scan all the files from
important = []  # The list for storing the lines with the key phrases
# If a line contains one of these key phrases, it should be stored in the important list and printed at the end
key_phrases = ["testing example1", "testing example2"]

for (path, subdirs, files) in os.walk(path):
    # The following 3 lines build sorted, openable full paths for the .txt and .log files
    files = [f for f in os.listdir(path) if f.endswith('.txt') or f.endswith('.log')]
    files.sort()  # files is now a sorted list
    files = [os.path.join(path, name) for name in files]
    for filename in files:
        with open(filename) as f:
            f = f.readlines()
        for line in f:
            for phrase in key_phrases:
                if phrase in line:
                    important.append(line)
                    break  # stop after the first matching phrase so the line is stored once

print(important)
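Since the script's indentation is what decides how often each line is collected, one plausible cause (an assumption, because the exact nesting is not certain) is that the `for line in f:` loop or the `print` call sits one level too far out or in. In that case the lines of the last file read would be scanned again for every directory that `os.walk` visits. The sketch below is one way to keep each file scanned exactly once. The function name `find_matching_lines`, the `extensions` parameter, and the `errors="replace"` setting are illustrative choices, not part of the original script.

```python
import os


def find_matching_lines(root, key_phrases, extensions=(".txt", ".log")):
    """Return every line under `root` that contains at least one key phrase."""
    matches = []
    # os.walk already yields the file names of each directory it visits,
    # so a separate os.listdir call is not needed.
    for dirpath, _subdirs, filenames in os.walk(root):
        for name in sorted(filenames):
            # str.endswith accepts a tuple of suffixes.
            if not name.endswith(extensions):
                continue
            full_path = os.path.join(dirpath, name)
            # errors="replace" is an assumption: it keeps the scan going
            # if a log file contains bytes that cannot be decoded.
            with open(full_path, errors="replace") as handle:
                # The line loop lives inside the per-file block, so each
                # file is scanned exactly once.
                for line in handle:
                    if any(phrase in line for phrase in key_phrases):
                        matches.append(line.rstrip("\n"))
    return matches


if __name__ == "__main__":
    phrases = ["testing example1", "testing example2"]
    for found in find_matching_lines("D:\\MyFolder\\", phrases):
        print(found)
```

Using `any(...)` stores a line once even when it contains several key phrases, which is what the `break` in the original loop does. Printing each match on its own line after the walk has finished also avoids printing the whole list repeatedly.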