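I want to find out whether my program has logged every file of a certain type. Basically, I have a log file that contains only file names, and I use a function to walk the files on disk and check whether each one appears in the log. The log is huge, but I did this in a brute-force way. Unfortunately, it does not work correctly.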
    import subprocess
    import sys
    import signal
    import shutil
    import os, fnmatch
    #open file to read
    f=open("logs", "r") #files are stored in this directory
    o=open("all_output_logs","w")
    e=open("missing_logs",'w')
    def locate(pattern, root=os.curdir):
        '''Locate all files matching supplied filename pattern in and below
        supplied root directory.'''
        #ignore directories- ignore works, just uncomment.
        #ignored = ["0201", "0306"]
        for path, dirs, files in os.walk(os.path.abspath(root)):
            #for dir in ignored:
            #    if dir in dirs:
            #        dirs.remove(dir)
            for filename in fnmatch.filter(files, pattern):
                yield os.path.join(path, filename)
    #here i log all the files in the output file to search in
    for line in f:
        if line.startswith("D:"):
            filename = line
            #print line
            o.write(filename)
    f.close()
    o.close()
    r.close()
    i=open("all_output_logs","r")
    #primitive search.. going through each file in the directory to see if its there in the log file
    for filename in locate("*.dll"):
        for line in i:
            if filename in i:
                count=count+1
                print count
            else:
                e.write(filename)
I never see my dummy variable count being printed, and I get only a single file name, which is from somewhere in the middle of the list.
Answer 0 (score: 1)
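The problem is that the lines of the file are only read on the first pass, and a file object (`i` in your case) does not support the `in` operator in the way you expect. You could change the code to: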
    lines = open("all_output_logs","r").readlines()
    for filename in locate("*.dll"):
        for line in lines:
            if filename in line:
                count=count+1
                print count
            else:
                e.write(filename)
But this is still inefficient and a bit awkward.
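The single-pass behaviour described above can be seen in a small sketch. It uses `io.StringIO` as an in-memory stand-in for the log file (an assumption made only so the example is self-contained; a real file opened in text mode iterates the same way), and the file names are made up for illustration.

```python
import io

# In-memory stand-in for the log file; the contents are illustrative.
log = io.StringIO(u"a.dll\nb.dll\n")

# The first loop consumes every line.
first_pass = [line.strip() for line in log]

# The second loop gets nothing: the iterator is already exhausted.
second_pass = [line.strip() for line in log]

# "x in file_object" silently iterates the remaining lines and compares each
# whole line (trailing newline included) for equality, so this is False even
# though the text "a.dll" is in the file, and it also consumes the file.
fresh_log = io.StringIO(u"a.dll\nb.dll\n")
found = u"a.dll" in fresh_log
left_over = list(fresh_log)
```

Here `first_pass` holds both names, while `second_pass` and `left_over` are empty and `found` is `False`, which matches the symptom in the question: the inner loop only does any work for the first file name.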
Since you said the log file is "huge", you probably don't want to read all of it into memory, so you will have to rewind it before every lookup:
    f = open("all_output_logs","r")
    for filename in locate("*.dll"):
        f.seek(0)
        for line in f:
            if filename in line:
                count=count+1
                print count
            else:
                e.write(filename)
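I left the `in` operator as it was, because you did not specify what each line of the log file contains. Otherwise one would expect `filename == line.strip()` to be the correct comparison.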
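If each log line really is exactly one full path, the exact comparison mentioned above can be done in a single pass over the log instead of rewinding it for every file. The following is only a sketch under that assumption: `find_missing` is a hypothetical helper name, `locate` is the generator from the question, and the set of matching files on disk is assumed to fit in memory even if the log does not.

```python
import fnmatch
import os


def locate(pattern, root=os.curdir):
    """Yield absolute paths of files in and below root matching pattern."""
    for path, dirs, files in os.walk(os.path.abspath(root)):
        for filename in fnmatch.filter(files, pattern):
            yield os.path.join(path, filename)


def find_missing(log_path, pattern, root=os.curdir):
    """Return the sorted matching paths that never appear as a log line.

    Hypothetical helper; assumes every log line is one complete path.
    """
    # Start with every matching file on disk as a candidate.
    missing = set(locate(pattern, root))
    # Stream the log once, dropping each path that was logged.
    with open(log_path) as log:
        for line in log:
            missing.discard(line.strip())
    return sorted(missing)
```

With this shape each missing file is reported once, for example by writing every entry of `find_missing("all_output_logs", "*.dll")` to the `missing_logs` file, rather than once per non-matching log line.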