Python script to search a directory for specific file types

Time: 2018-11-03 19:37:25

Tags: python python-3.x os.path

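Hi everyone. In the script below, I first try to list every file and directory in my home directory, and then check whether any of the files have a specific extension, e.g. (.py, .mkv).

It works fine for files located directly in the home directory, but when I want it to look inside the other directories for matching files, it does not work correctly.

Here is my code: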

import os

class Sorter(object):
    path = os.environ['HOME']
    all_dirs = list()
    all_items = list()
    address = None
    movies = list()


    def __init__(self):
        pass

    def list_directories(self):
        dirs = os.listdir(self.path)
        for d in dirs:
            if os.path.isdir(os.path.join(self.path,d)):
                self.all_dirs.append(d)

            elif os.path.isfile(os.path.join(self.path,d)):
                self.all_items.append(d)

    def find_movies(self):

        for item in self.all_items:
            if os.path.splitext(os.path.join(self.path,item))[1] in ['.mp3','.mkv']:
                self.movies.append(item)
        for directory in self.all_dirs:
            try:
                os.chdir(os.path.join(self.path,directory))
                for i in directory:
                    if os.path.splitext(os.path.join(self.path,item))[1] in ['.mp3','.mkv']:
                        self.movies.append(item)
                os.chdir(self.path)
            except:
                pass

2 answers:

Answer 0 (score: 3)

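You can use the standard library pathlib module and its glob method to search for files by extension.

The glob dialect is not as powerful as bash's, but you can use ** for recursive subdirectory matching. You cannot use bash-style brace expansion such as *.{mp3,mkv}. Instead, you can chain the results of several glob searches.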

from pathlib import Path

def find_files(root, extensions):
    for ext in extensions:
        yield from Path(root).glob(f'**/*.{ext}')

for movie in find_files(Path.home() / 'Videos', ['mp4', 'mkv', 'avi']):
    print(movie)

Note that Path objects overload the / operator, so Path.home() / 'Videos' yields a Path object representing /home/username/Videos/
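
As a rough illustration of the / operator, and of one alternative to chaining several globs, the sketch below walks the tree once with rglob and filters on Path.suffix. The helper name find_by_suffix and the throwaway directory layout are made up for this example, and the case-insensitive comparison is an assumption that the answer above does not make.

```python
import tempfile
from pathlib import Path


def find_by_suffix(root, suffixes):
    """Yield files under root whose suffix is in suffixes (hypothetical helper)."""
    wanted = {s.lower() for s in suffixes}
    # rglob('*') visits every entry below root, so the tree is traversed once.
    for path in Path(root).rglob('*'):
        if path.is_file() and path.suffix.lower() in wanted:
            yield path


# Build a throwaway tree so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    videos = Path(tmp) / 'Videos'            # "/" joins path components
    (videos / 'series').mkdir(parents=True)
    for name in ['a.mkv', 'notes.txt', 'series/b.MP4']:
        (videos / name).touch()

    # Collect paths relative to the Videos directory, in a stable order.
    found = sorted(p.relative_to(videos).as_posix()
                   for p in find_by_suffix(videos, ['.mp4', '.mkv']))
    print(found)
```

Compared with one glob per extension, this reads the directory tree a single time, at the cost of inspecting every entry in Python rather than letting the pattern do the filtering.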

Answer 1 (score: 1)

This seems overly complicated. Use os.walk and list comprehensions to filter the files; see the following approach:

import os

Create the files:

dirs = [r"./subdir",r"./subdir/tata",r"subdir/tarumpa",r"./dir2b"]
files = ["k.mp4","some.txt","cool.mp3"]

def touch(p,fn):
    with open(os.path.join(p,fn),"w") as f:
        f.write(" ")

for d in dirs:
    os.mkdir(d)
    for f in files:
        touch(d,f)

Find the files:

movie = []        
music = []        

# os.walk recurses into subdirectories: for each directory it visits, it yields
# the directory itself as root, its subdirectories as dirs and its files as
# files, then it steps into each of the dirs and does the same ...
for root,dirs,files in os.walk("./"):
    # root is the dir we are currently in, f is each filename with the wanted extension
    movie.extend( (os.path.join(root,f) for f in files if f.endswith(".mp4")) )
    music.extend( (os.path.join(root,f) for f in files if f.endswith(".mp3")) )

print(movie)
print(music)

Output:

# movies
['./subdir/k.mp4', './subdir/tarumpa/k.mp4', './subdir/tata/k.mp4', './dir2b/k.mp4']

# music
['./subdir/cool.mp3', './subdir/tarumpa/cool.mp3', 
 './subdir/tata/cool.mp3', './dir2b/cool.mp3']
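
The two endswith filters above can be generalised. As a sketch only, the version below makes a single os.walk pass and groups matches by category using os.path.splitext, the same function the question uses. The names CATEGORIES and collect_media, and the extension-to-category mapping itself, are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical mapping from file extension to category; adjust as needed.
CATEGORIES = {'.mp4': 'movie', '.mkv': 'movie', '.mp3': 'music'}


def collect_media(top):
    """Walk top once and return {category: [paths]} (illustrative helper)."""
    found = {}
    for root, _dirs, files in os.walk(top):
        for name in files:
            # splitext returns (stem, extension); compare the extension in lower case.
            ext = os.path.splitext(name)[1].lower()
            category = CATEGORIES.get(ext)
            if category is not None:
                found.setdefault(category, []).append(os.path.join(root, name))
    return found


# Build a small temporary tree so the example can run anywhere.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, 'subdir', 'tata'))
    for rel in ['subdir/k.mp4', 'subdir/some.txt', 'subdir/tata/cool.mp3']:
        with open(os.path.join(top, *rel.split('/')), 'w') as f:
            f.write(' ')

    result = collect_media(top)
    # Normalise to relative, forward-slash paths for display.
    summary = {cat: sorted(os.path.relpath(p, top).replace(os.sep, '/') for p in paths)
               for cat, paths in result.items()}
    print(summary)
```

Adding a new file type then only means adding an entry to the mapping, rather than another list and another filter line.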