Finding file names (from a directory) in a file

Time: 2011-03-21 23:50:02

Tags: python

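I want to find out whether my program has logged every file of a certain type. Basically, I have a log file that contains only file names, and I use a function that walks the directory tree to check whether each file appears in it. The log is large, but I went for a brute-force approach anyway. Unfortunately, it doesn't work properly.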

import subprocess
import sys
import signal
import shutil
import os, fnmatch


#open file to read
f=open("logs", "r") #files are stored in this directory
o=open("all_output_logs","w")
e=open("missing_logs",'w')


def locate(pattern, root=os.curdir):
    '''Locate all files matching supplied filename pattern in and below
    supplied root directory.'''
        #ignore directories- ignore works, just uncomment. 
    #ignored = ["0201", "0306"]
    for path, dirs, files in os.walk(os.path.abspath(root)):
        #for dir in ignored:
           # if dir in dirs: 
                #dirs.remove(dir)
        for filename in fnmatch.filter(files, pattern):
            yield os.path.join(path, filename)



    #here i log all the files in the output file to search in
for line in f:
    if line.startswith("D:"):
        filename = line
        #print line
        o.write(filename)

f.close()
o.close()
r.close()

i=open("all_output_logs","r")
#primitive search.. going through each file in the directory to see if its there in  the log file
for filename in locate("*.dll"):
    for line in i:
        if filename in i:
            count=count+1
            print count
        else:
            e.write(filename)

I never see my dummy variable count printed, and I only get a single file name, which comes from somewhere in the middle of the list.

1 Answer:

Answer 0: (score: 1)

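The problem is that the lines of the file are only read on the first pass: once the inner loop has consumed the file, every later pass sees nothing. On top of that, a file object (i in your case) doesn't support the in operator the way you expect. You could change the code to: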

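count = 0  # running total of matches
with open("all_output_logs", "r") as log:
    lines = log.readlines()  # read once so the list can be scanned repeatedly
for filename in locate("*.dll"):
    for line in lines:
        if filename in line:
            count = count + 1
            print(count)
        else:
            # note: reached once per non-matching line, not once per missing file
            e.write(filename)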

But that is still inefficient and a bit awkward.
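To make the one-pass behaviour concrete, here is a small sketch. The two-line log and the temporary file are made up for the demonstration. It shows that a second loop over the same file object yields nothing, and that `in` on a file object simply iterates it and compares against whole lines, trailing newline included.

```python
import os
import tempfile

# Hypothetical two-line log, written to a temporary file for the demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("a.dll\nb.dll\n")
    log_path = tmp.name

with open(log_path, "r") as f:
    first_pass = list(f)    # consumes every line
    second_pass = list(f)   # the iterator is exhausted, so nothing is left

    f.seek(0)                         # rewind to the start
    bare_name_found = "a.dll" in f    # compared against whole lines, newline included
    f.seek(0)
    full_line_found = "a.dll\n" in f  # stops at the first line that is equal
    rest = list(f)                    # whatever the membership test did not consume

os.remove(log_path)

print(first_pass, second_pass, bare_name_found, full_line_found, rest)
```

In the question's code this means `filename in i` both compares against the wrong thing and silently eats the rest of the file.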

Since you said the log file is "huge", you probably don't want to read it all into memory, so you would have to rewind it for every lookup:

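count = 0  # running total of matches
f = open("all_output_logs", "r")
for filename in locate("*.dll"):
    f.seek(0)  # rewind so the inner loop sees every line again
    for line in f:
        if filename in line:
            count = count + 1
            print(count)
        else:
            e.write(filename)
f.close()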

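I left the in operator in place because you didn't say what each line of the log file contains. One would normally expect filename == line.strip() to be the correct comparison.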
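If each log line really is exactly one path, in the same form that locate() yields, the nested loops can be avoided altogether. The following is only a sketch under that assumption: `find_unlogged` is a hypothetical helper name, and `locate` is a trimmed copy of the function from the question. It keeps the directory listing in a set (usually far smaller than the log), streams the log once, and whatever is left in the set was never logged.

```python
import fnmatch
import os


def locate(pattern, root=os.curdir):
    """Yield absolute paths of files under root whose names match pattern."""
    for path, dirs, files in os.walk(os.path.abspath(root)):
        for filename in fnmatch.filter(files, pattern):
            yield os.path.join(path, filename)


def find_unlogged(log_path, pattern, root=os.curdir):
    """Return the sorted paths matching pattern under root that the log never lists.

    Assumption: every log line is exactly one path, as produced by locate().
    """
    remaining = set(locate(pattern, root))   # only the directory listing is held in memory
    with open(log_path, "r") as log:
        for line in log:                     # single pass over the log, no rewinding
            remaining.discard(line.strip())  # exact comparison, newline removed
            if not remaining:                # every file accounted for, stop early
                break
    return sorted(remaining)
```

Each missing file is then reported once, for example by looping over `find_unlogged("all_output_logs", "*.dll")` and writing each path plus a newline to the missing-files log. If the log lines carry extra text around the path, the exact comparison no longer applies and a substring test like the one above is needed instead.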