How to iterate over each subdirectory and run a command on matching files (Python)

Date: 2016-09-05 12:00:03

Tags: python loops

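I have been working on a script that checks every subdirectory of a directory, matches files with a regular expression, and then runs a different command depending on the file type.

What I have working so far is choosing the command based on the regex match. Right now it checks for .zip, .rar, or .r00 files and uses a different command for each match. What I need help with is iterating over each directory and first checking whether it contains an .mkv file. If it does, the script should skip that directory and move on to the next one. If it does not, and an archive matches, the script should run the command and continue to the next directory once it finishes.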

import os
import re

rx = '(.*zip$)|(.*rar$)|(.*r00$)'
path = "/mnt/externa/folder"

for root, dirs, files in os.walk(path):

    for file in files:
        res = re.match(rx, file)
        if res:
            if res.group(1):
                print("Unzipping ",file, "...")
                os.system("unzip " + root + "/" + file + " -d " + root)
            elif res.group(2):
                os.system("unrar e " + root + "/" + file + " " + root)
            if res.group(3):
                print("Unraring ",file, "...")
                os.system("unrar e " + root + "/" + file + " " + root)

Edit:

This is my current code:

import os
import re
from subprocess import check_call
from os.path import join

rx = '(.*zip$)|(.*rar$)|(.*r00$)'
path = "/mnt/externa/Torrents/completed/test"

for root, dirs, files in os.walk(path):
    if not any(f.endswith(".mkv") for f in files):
        found_r = False
        for file in files:
            pth = join(root, file)
            try:
                 if file.endswith(".zip"):
                    print("Unzipping ",file, "...")
                    check_call(["unzip", pth, "-d", root])
                    found_zip = True
                 elif not found_r and file.endswith((".rar",".r00")):
                     check_call(["unrar","e","-o-", pth, root,])
                     found_r = True
                     break
            except ValueError:
                print ("Oops! That did not work")

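This script works well, but I sometimes run into problems when a folder contains a Subs directory. This is the error message I get when running the script: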

$ python unrarscript.py

UNRAR 5.30 beta 2 freeware      Copyright (c) 1993-2015    Alexander Roshal


Extracting from /mnt/externa/Torrents/completed/test/The.Conjuring.2013.1080p.BluRay.x264-ALLiANCE/Subs/the.conjuring.2013.1080p.bluray.x264-alliance.subs.rar

No files to extract
Traceback (most recent call last):
  File "unrarscript.py", line 19, in <module>
    check_call(["unrar","e","-o-", pth, root])
  File "/usr/lib/python2.7/subprocess.py", line 541, in     check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['unrar', 'e', '-o-', '/mnt/externa/Torrents/completed/test/The.Conjuring.2013.1080p.BluRay.x264-ALLiANCE/Subs/the.conjuring.2013.1080p.bluray.x264-alliance.subs.rar', '/mnt/externa/Torrents/completed/test/The.Conjuring.2013.1080p.BluRay.x264-ALLiANCE/Subs']' returned non-zero exit status 10

I can't really work out what is wrong with the code, so I hope someone is willing to help me.
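
For readers following the traceback: the exception shown is subprocess.CalledProcessError, which an `except ValueError` clause does not catch. Below is a minimal sketch of catching it instead. The helper name `try_extract` is hypothetical, and a small Python child process that exits with status 10 stands in for the unrar call so the snippet runs without unrar installed.

```python
import sys
from subprocess import CalledProcessError, check_call


def try_extract(cmd):
    """Run cmd; return its exit status instead of raising on failure."""
    try:
        check_call(cmd)
    except CalledProcessError as exc:
        # check_call raises CalledProcessError (not ValueError) on a non-zero exit
        print("Command failed with exit status", exc.returncode)
        return exc.returncode
    return 0


# Stand-in for the unrar call: a child process that exits with status 10.
status = try_extract([sys.executable, "-c", "import sys; sys.exit(10)"])
```

With this shape, the os.walk loop could log the failure and carry on with the next directory rather than stopping at the first archive that unrar rejects.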

4 Answers:

Answer 0: (score: 2)

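Just use any to check whether any file ends with .mkv before doing anything else. You can also simplify to if/else, because the last two matches do the same thing. Using subprocess.check_call is also a better approach than os.system: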

import os
import re
from subprocess import check_call
from os.path import join

rx = '(.*zip$)|(.*rar$)|(.*r00$)'
path = "/mnt/externa/folder"


for root, dirs, files in os.walk(path):
    if not any(f.endswith(".mkv") for f in files):
        for file in files:
            res = re.match(rx, file)
            if res:
                # use os.path.join 
                pth = join(root, file)
                # it can only be res.group(1) or  one of the other two so we only need if/else. 
                if res.group(1): 
                    print("Unzipping ",file, "...")
                    check_call(["unzip" , pth, "-d", root])
                else:
                    check_call(["unrar","e", pth,  root])

You can also drop the regex and use if/elif with str.endswith:

for root, dirs, files in os.walk(path):
    if not any(f.endswith(".mkv") for f in files):
        for file in files:
            pth = join(root, file)
            if file.endswith("zip"):
                print("Unzipping ",file, "...")
                check_call(["unzip" , pth, "-d", root])
            elif file.endswith((".rar",".r00")):
                check_call(["unrar","e", pth,  root])
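
The two checks used above, any for the .mkv test and str.endswith with a tuple of suffixes, can be tried in isolation without touching the file system. This is only a sketch; the helper names `has_mkv` and `build_command` are made up for illustration, and the commands are built but never run.

```python
from os.path import join


def has_mkv(files):
    # any() stops at the first file name that ends with .mkv
    return any(f.endswith(".mkv") for f in files)


def build_command(root, name):
    """Return the argument list for the archive, or None if it is not an archive."""
    pth = join(root, name)
    if name.endswith(".zip"):
        return ["unzip", pth, "-d", root]
    # endswith accepts a tuple, so both suffixes are checked in one call
    if name.endswith((".rar", ".r00")):
        return ["unrar", "e", pth, root]
    return None
```

The returned list has the shape check_call expects, so the loop body reduces to building the command and running it when it is not None.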

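If you really care about speed and about not repeating steps, you can filter as you iterate: collect the files by extension (taken by slicing) while you check for .mkv, and use for/else logic: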

# the slice file[-4:] includes the dot, so the keys must include it too
good = {".rar", ".zip", ".r00"}
for root, dirs, files in os.walk(path):
    # no separate any() pass: the loop below checks for .mkv itself
    tmp = {".rar": [], ".zip": [], ".r00": []}
    for file in files:
        ext = file[-4:]
        if ext == ".mkv":
            # an .mkv is present, skip this directory
            break
        elif ext in good:
            tmp[ext].append(join(root, file))
    else:
        # only reached when the loop finished without hitting break
        for p in tmp[".zip"]:
            print("Unzipping ", p, "...")
            check_call(["unzip", p, "-d", root])
        for p in tmp[".rar"] + tmp[".r00"]:
            check_call(["unrar", "e", p, root])

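This short-circuits on any match for .mkv; otherwise it iterates only over the matches for .zip, .rar, or .r00. Unless you really care about efficiency, though, I would use the second approach.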
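
The for/else construct is easy to misread, so here is a small sketch of it on plain lists of file names. The function name `archives_unless_mkv` is hypothetical; the else branch runs only when the loop ends without a break.

```python
def archives_unless_mkv(files):
    """Return the archive names, or an empty list if an .mkv file is present."""
    archives = []
    for name in files:
        if name.endswith(".mkv"):
            # short-circuit: nothing in this directory should be extracted
            break
        if name.endswith((".zip", ".rar", ".r00")):
            archives.append(name)
    else:
        # reached only if the loop was not left through break
        return archives
    return []
```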

To avoid overwriting files, you can extract each archive into a new subdirectory, using a counter to help create the new directory names:

from itertools import count


for root, dirs, files in os.walk(path):
        if not any(f.endswith(".mkv") for f in files):
            counter = count()
            for file in files:
                pth = join(root, file)
                if file.endswith("zip"):
                    p = join(root, "sub_{}".format(next(counter)))
                    os.mkdir(p)
                    print("Unzipping ",file, "...")
                    check_call(["unzip" , pth, "-d", p])
                elif file.endswith((".rar",".r00")):
                    p = join(root, "sub_{}".format(next(counter)))
                    os.mkdir(p)
                    check_call(["unrar","e", pth,  p])

Each archive is extracted into its own new directory under root, i.e. root_path/sub_0, root_path/sub_1, and so on.
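
To see which names the counter produces, the directory-creation step can be run on its own in a temporary directory. This is a sketch: the helper `new_subdirs` is invented for the example, and the temporary directory stands in for root.

```python
import os
import tempfile
from itertools import count


def new_subdirs(root, n):
    """Create n numbered subdirectories under root and return their names."""
    counter = count()  # yields 0, 1, 2, ...
    created = []
    for _ in range(n):
        p = os.path.join(root, "sub_{}".format(next(counter)))
        os.mkdir(p)
        created.append(os.path.basename(p))
    return created


root = tempfile.mkdtemp()  # throwaway directory standing in for root
names = new_subdirs(root, 3)
```

Note that os.mkdir raises an error if the directory already exists, so running the script twice over the same tree would need a different naming scheme or an existence check.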

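It would be best to add an example to your question, but if the real problem is that you only need to extract one of the .rar or .r00 files, you can set a flag when you find the first match for .rar or .r00 and only unpack while the flag is not set: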

for root, dirs, files in os.walk(path):
    if not any(f.endswith(".mkv") for f in files):
        found_r = False
        for file in files:
            pth = join(root, file)
            if file.endswith("zip"):
                print("Unzipping ",file, "...")
                check_call(["unzip", pth, "-d", root])
                found_zip = True
            elif not found_r and file.endswith((".rar",".r00")):
                check_call(["unrar","e", pth,  root])
                found_r = True     

If there is also only one zip to extract, you can set two flags and leave the loop once both are set.
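
A sketch of that two-flag idea, written so it can be tried on a plain list of names. The function name `pick_archives` is hypothetical, and it only selects the files rather than extracting them.

```python
def pick_archives(files):
    """Return (first zip, first rar/r00) found in files; None where absent."""
    found_zip = None
    found_r = None
    for name in files:
        if found_zip is None and name.endswith(".zip"):
            found_zip = name
        elif found_r is None and name.endswith((".rar", ".r00")):
            found_r = name
        if found_zip is not None and found_r is not None:
            # both kinds have been seen, no need to look at the rest
            break
    return found_zip, found_r
```

Which of .rar or .r00 is picked depends on the order of the list, and os.walk does not guarantee any particular order, so sorting files first may be worth considering.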

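Answer 1: (score: 1)

The following example works as-is! As @Padraic suggested, I replaced os.system with the more suitable subprocess.

How about joining all the file names into a single string and searching that string for *.mkv?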

import os
import re
from subprocess import check_call
from os.path import join

rx = '(.*zip$)|(.*rar$)|(.*r00$)'
path = "/mnt/externa/folder"
regex_mkv = re.compile(r'.*\.mkv,')
for root, dirs, files in os.walk(path):

    string_files = ','.join(files)+', '
    if regex_mkv.match(string_files): continue

    for file in files:
        res = re.match(rx, file)
        if res:
            # use os.path.join 
            pth = join(root, file)
            # it can only be res.group(1) or  one of the other two so we only need if/else. 
            if res.group(1): 
                print("Unzipping ",file, "...")
                check_call(["unzip" , pth, "-d", root])
            else:
                check_call(["unrar","e", pth,  root])
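
The joined-string check can be exercised separately to see why the trailing separator matters: without it, the last file name would never be followed by a comma and an .mkv at the end of the list would be missed. This is a sketch; `contains_mkv` is a made-up name, and it assumes file names do not themselves contain commas.

```python
import re

# a name ending in .mkv is always followed by the comma separator
regex_mkv = re.compile(r'.*\.mkv,')


def contains_mkv(files):
    # append one more comma so the last name is terminated as well
    joined = ','.join(files) + ','
    return bool(regex_mkv.match(joined))
```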

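Answer 2: (score: 0)

re is overkill for something like this. There is a library function for extracting file extensions, os.path.splitext. In the example below, we build a map from extension to file names, which we use to check for the presence of .mkv files in constant time and to map each file name to the appropriate command.

Note that you can also unzip files with zipfile (standard library), and third-party packages are available for .rar files.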

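import os

path = "/mnt/externa/folder"

for root, dirs, files in os.walk(path):
    # map each extension to the file names that carry it
    ext_map = {}
    for fn in files:
        ext_map.setdefault(os.path.splitext(fn)[1], []).append(fn)
    if '.mkv' not in ext_map:
        # items() works on both Python 2 and Python 3
        for ext, fnames in ext_map.items():
            for fn in fnames:
                # fn is only the file name, so build the full path
                pth = os.path.join(root, fn)
                if ext == ".zip":
                    os.system("unzip %s -d %s" % (pth, root))
                elif ext == ".rar" or ext == ".r00":
                    os.system("unrar e %s %s" % (pth, root))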
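
Since zipfile from the standard library is mentioned above as an alternative to shelling out to unzip, here is a minimal sketch of extracting with it. The archive and file names are made up, and the snippet creates its own small zip in a temporary directory so it can run anywhere.

```python
import os
import tempfile
import zipfile

workdir = tempfile.mkdtemp()  # throwaway directory for the demo
archive = os.path.join(workdir, "demo.zip")

# create a tiny archive so there is something to extract
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("hello.txt", "hello")

# extract everything next to the archive, like: unzip demo.zip -d workdir
with zipfile.ZipFile(archive) as zf:
    zf.extractall(workdir)

extracted = os.path.join(workdir, "hello.txt")
```

In the os.walk loop this would replace the unzip command for .zip files; .rar files would still need unrar or a third-party package.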

Answer 3: (score: -2)

import os
import re

regex = re.compile(r'(.*zip$)|(.*rar$)|(.*r00$)')
path = "/mnt/externa/folder"
for root, dirs, files in os.walk(path):
    for file in files:
        res = regex.match(file)
        if res:
           if res.group(1):
              print("Unzipping ",file, "...")
              os.system("unzip " + root + "/" + file + " -d " + root)
           elif res.group(2):
              os.system("unrar e " + root + "/" + file + " " + root)
           else:
              print("Unraring ",file, "...")
              os.system("unrar e " + root + "/" + file + " " + root)