Cleaner way to move renamed files while walking a directory

Time: 2015-06-13 04:53:55

Tags: python regex algorithm python-3.x file-io

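I created a script that walks my directories and moves my music files into my music folder, renaming them along the way: it uses the regular expression library to check whether a file name starts with a track number (which I find annoying). The script seems to work, as I haven't run into any errors, but I'm wondering whether there is a cleaner way to do this, since it looks a bit convoluted and unpythonic (I'm trying to write cleaner code). To be clear about "cleaner": I'm not asking anyone to rewrite my whole code block, only the check_names function and the way it is used when moving the files. If there are other mistakes, pointing them out would help too, for the reasons below.

If this is terrible code, I apologize for any Python "sins" I may have committed; I have only used the os and regex modules a few times before.

Since I'm still learning these modules and have had little exposure to them, an explanation of why would go a long way toward my understanding. All the os.path.join(....) calls are my attempt to be cross-platform without hard-coding paths. The reason I say this seems convoluted is the run time: it takes about 1-2 minutes to do the following for 6-7 folders: extract the compressed .zip archives into my original directory, rename the files where needed, move them, and finally go back to the original directory and delete whatever the move left behind. Or is that a normal run time for everything that is going on? (The archives are about 100-300 MB.)

The relevant functions are: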

def check_names(dirpath, file_name):
    check = False
    new_name = None
    first_check = re.compile("^\d\d - ").match(file_name)
    second_check = re.compile("^\d\d ").match(file_name)
    if first_check != None or second_check != None:
       check = True
       if first_check:
          new_name = file_name[first_check.span()[1]:]
          shutil.move(os.path.join(dirpath, file_name),
                      os.path.join(dirpath, new_name))
       else:
          new_name = file_name[second_check.span()[1]:]
          shutil.move(os.path.join(dirpath, file_name),
                      os.path.join(dirpath, new_name))
    return check, new_name

def move_music(source, destination, file_extension, sub_string):
    source_dir = os.path.split(source)[-1]
    for dirpath, dirnames, filenames in os.walk(source):
        for a_file in filenames:
            if (a_file.endswith(file_extension) and sub_string in a_file):
                check = check_names(dirpath, a_file)
                dir_name = os.path.split(dirpath)[-1]
                if dir_name != source_dir:
                    check_folders(destination, dir_name)
                    if os.path.join(source, dir_name) not in COPIED_DIRECTORIES:
                          COPIED_DIRECTORIES.append(os.path.join(source, dir_name))
                    shutil.move(os.path.join(dirpath, a_file if not check[0] else check[1]),
                                os.path.join(destination , dir_name))

                else:
                    shutil.move(os.path.join(dirpath, a_file if not check[0] else check[1]), destination)

1 answer:

Answer 0 (score: 1)

First things first:

You can break up a statement like this and improve readability significantly.

OLD:

if first_check:
  new_name = file_name[first_check.span()[1]:]
  shutil.move(os.path.join(dirpath, file_name),
              os.path.join(dirpath, new_name))
else:
   new_name = file_name[second_check.span()[1]:]
   shutil.move(os.path.join(dirpath, file_name),
               os.path.join(dirpath, new_name))

NEW:

if first_check:
    new_name = file_name[first_check.span()[1]:]
else:
    new_name = file_name[second_check.span()[1]:]
shutil.move(os.path.join(dirpath, file_name),
            os.path.join(dirpath, new_name))

The only thing that differs between the two branches is how new_name is computed, so the long function call does not need to be repeated inside the if/else.
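As a possible further step, which is not part of the original answer, the two patterns differ only by an optional " -", so a single pattern could cover both cases and remove the if/else altogether. This is a minimal sketch; the names `TRACK_PREFIX` and `strip_track_prefix` are hypothetical.

```python
import re

# Hypothetical combined pattern: two digits, an optional " -", then one space.
# It covers both "01 - Song.mp3" and "01 Song.mp3".
TRACK_PREFIX = r"^\d\d(?: -)? "


def strip_track_prefix(file_name):
    """Return file_name without a leading track number, if it has one."""
    match = re.match(TRACK_PREFIX, file_name)
    if match:
        # match.end() is the index just past the matched prefix.
        return file_name[match.end():]
    return file_name
```

Keeping the name computation free of file-system calls also makes it easy to try out on a few sample names before any file is touched.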

Next, define your globals (the edit further down explains why this step is optional):

FIRST_CHECK = re.compile(r"^\d\d - ")
SECOND_CHECK = re.compile(r"^\d\d ")

Next, a regex match object is truthy on its own (and a failed match is None), so you can remove:

if first_check != None:

and replace it with:

if first_check:
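This works because `re.match` returns `None` when the pattern does not match and a match object otherwise, and a match object is always truthy. A small illustration with made-up file names:

```python
import re

hit = re.match(r"^\d\d - ", "07 - Intro.mp3")   # a match object
miss = re.match(r"^\d\d - ", "Intro.mp3")       # None

# Both spellings select the same branch; the plain truth test is shorter.
assert (hit != None) == bool(hit)
assert (miss != None) == bool(miss)

if hit:
    print(hit.end())  # index just past the matched prefix
```

Where an explicit comparison is still wanted, `is not None` is the usual spelling rather than `!= None`.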

Next, refactor your code: your second function is long and unwieldy. Break it apart. Split move_music into 2 functions:

# source is passed in as well, because it is needed to build the
# COPIED_DIRECTORIES entry.
def move_music(source, source_dir, destination, dirpath, a_file):
    check = check_names(dirpath, a_file)
    dir_name = os.path.split(dirpath)[-1]
    if dir_name != source_dir:
        check_folders(destination, dir_name)
        if os.path.join(source, dir_name) not in COPIED_DIRECTORIES:
            COPIED_DIRECTORIES.append(os.path.join(source, dir_name))
        shutil.move(os.path.join(dirpath, a_file if not check[0] else check[1]),
                    os.path.join(destination, dir_name))

    else:
        shutil.move(os.path.join(dirpath, a_file if not check[0] else check[1]), destination)

def check_move(source, destination, file_extension, sub_string):
    source_dir = os.path.split(source)[-1]
    for dirpath, dirnames, filenames in os.walk(source):
        for a_file in filenames:
            if (a_file.endswith(file_extension) and sub_string in a_file):
                move_music(source, source_dir, destination, dirpath, a_file)

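Next, your lines are too long: limit yourself to 72 characters per line for docstrings and 79 characters for actual code. This helps readability: how else is a coder going to view the code in a split-screen text editor?

You can get part of the way there by factoring out redundant statements and refactoring the code further: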

def get_old_file(check, a_file):
    if not check[0]:
        old_file = a_file
    else:
        old_file = check[1]
    return old_file
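One alternative worth considering, offered here as an assumption about what might suit the asker rather than something the original answer proposes: if the checking function returned the name the file ends up with, renamed or not, the caller would need neither the (check, new_name) tuple nor a helper to unpack it. A sketch, with `rename_if_numbered` as a hypothetical name:

```python
import os
import re
import shutil


def rename_if_numbered(dirpath, file_name):
    """Strip a leading "NN - " or "NN " and return the resulting file name.

    The file is renamed on disk only when such a prefix was found;
    otherwise the original name is returned unchanged.
    """
    match = (re.match(r"^\d\d - ", file_name)
             or re.match(r"^\d\d ", file_name))
    if not match:
        return file_name
    new_name = file_name[match.end():]
    shutil.move(os.path.join(dirpath, file_name),
                os.path.join(dirpath, new_name))
    return new_name
```

The caller can then always join `dirpath` with the returned name. Note that, like the original, this does not guard against two files in the same folder that differ only by their track number; the second rename would collide with the first.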

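Here is your final code: you still have work to do! Comment it and add docstrings.

EDIT: thanks to @leewangzhong, who correctly pointed out that compiled regular expressions are cached, and since we only use 2 regular expressions, we don't need to worry about compiling them explicitly.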
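To illustrate the caching point: module-level functions such as `re.match` compile the pattern on first use and keep recently used patterns in an internal cache, so calling them repeatedly with the same couple of patterns does not recompile every time. Both styles give the same result; a small check with a made-up file name:

```python
import re

PATTERN = r"^\d\d - "
name = "03 - Track.mp3"

compiled = re.compile(PATTERN)          # explicit, reusable pattern object
via_object = compiled.match(name)
via_module = re.match(PATTERN, name)    # compiled and cached internally

# Same span either way, so the choice is about style, not behaviour.
assert via_object.span() == via_module.span()
```

Explicit compilation mainly pays off when a program juggles many different patterns, or when a named pattern object reads better than a repeated string.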

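import os
import re
import shutil

# check_folders() and COPIED_DIRECTORIES come from the rest of the asker's
# script, which is not shown here.


def check_names(dirpath, file_name):
    check = False
    new_name = None
    first_check = re.match(r"^\d\d - ", file_name)
    second_check = re.match(r"^\d\d ", file_name)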
    if first_check or second_check:
        check = True
        if first_check:
            new_name = file_name[first_check.span()[1]:]
        else:
            new_name = file_name[second_check.span()[1]:]
        old_path = os.path.join(dirpath, file_name)
        new_path = os.path.join(dirpath, new_name)
        shutil.move(old_path, new_path)
    return check, new_name


def get_old_file(check, a_file):
    if not check[0]:
        old_file = a_file
    else:
        old_file = check[1]
    return old_file

# source is passed in as well, because it is needed to build the
# COPIED_DIRECTORIES entry.
def move_music(source, source_dir, destination, dirpath, a_file):
    check = check_names(dirpath, a_file)
    dir_name = os.path.split(dirpath)[-1]
    if dir_name != source_dir:
        check_folders(destination, dir_name)
        path = os.path.join(source, dir_name)
        if path not in COPIED_DIRECTORIES:
            COPIED_DIRECTORIES.append(path)
        old_file = get_old_file(check, a_file)
        old_path = os.path.join(dirpath, old_file)
        new_path = os.path.join(destination, dir_name)
        shutil.move(old_path, new_path)

    else:
        old_file = get_old_file(check, a_file)
        old_path = os.path.join(dirpath, old_file)
        shutil.move(old_path, destination)

def check_move(source, destination, file_extension, sub_string):
    source_dir = os.path.split(source)[-1]
    for dirpath, dirnames, filenames in os.walk(source):
        for a_file in filenames:
            if (a_file.endswith(file_extension)
                    and sub_string in a_file):
                move_music(source, source_dir, destination, dirpath, a_file)
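
For experimenting on throwaway directories, here is a self-contained sketch of the overall walk-and-move flow. It rests on assumptions the question does not confirm: check_folders is not shown, so it is assumed to only create the destination sub-folder, which `os.makedirs(..., exist_ok=True)` does; the rename is folded into the move instead of being a separate step; a set stands in for the COPIED_DIRECTORIES list; and the name `collect_music` is hypothetical.

```python
import os
import re
import shutil


def collect_music(source, destination,
                  file_extension=".mp3", sub_string=""):
    """Move matching files into destination, dropping leading track numbers.

    Files found in a sub-folder of source go into a folder of the same
    name under destination. Returns the set of source sub-folders that
    files were taken from.
    """
    os.makedirs(destination, exist_ok=True)
    moved_from = set()
    for dirpath, _dirnames, filenames in os.walk(source):
        for a_file in filenames:
            if not (a_file.endswith(file_extension)
                    and sub_string in a_file):
                continue
            # Strip "NN - " or "NN " if present; otherwise keep the name.
            new_name = re.sub(r"^\d\d(?: -)? ", "", a_file)
            if dirpath == source:
                target_dir = destination
            else:
                target_dir = os.path.join(destination,
                                          os.path.basename(dirpath))
                os.makedirs(target_dir, exist_ok=True)
                moved_from.add(dirpath)
            # Passing the full target path renames and moves in one call.
            shutil.move(os.path.join(dirpath, a_file),
                        os.path.join(target_dir, new_name))
    return moved_from
```

The sketch assumes destination lies outside source; otherwise os.walk would also visit the files that were just moved.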