Python - truncating unknown file names

Time: 2015-02-19 19:19:17

Tags: python

Suppose I have the following files in a directory:

snackbox_1a.dat
zebrabar_3z.dat
cornrows_00.dat
meatpack_z2.dat

I have several of these directories, and in each of them the files follow the same format, i.e.:

snackbox_xx.dat
zebrabar_xx.dat
cornrows_xx.dat
meatpack_xx.dat

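So what I know about these files is the first part (snackbox, zebrabar, cornrows, meatpack). What I don't know is the bit just before the file extension (the 'xx'). It changes from file to file within a directory and also across directories (so another directory may have different xx values, such as 12, yy, 2m, 0t, etc.).

Is there a way for me to rename all of these files, or truncate them all (since xx.dat is always the same length), so that they are easy to use when I try to call them? For example, I would like to rename them so that another script can use a simple index to step through and find the files I want (rather than having to go into each directory and pull the files out by hand).

In other words, I would like to change the file names to: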

snackbox.dat
zebrabar.dat
cornrows.dat
meatpack.dat

Thanks!

3 answers:

Answer 0: (score: 2)

You can use shutil.move to move the files. To compute the new file name, you can use Python's string split method:

original_name = "snackbox_12.dat"
truncated_name = original_name.split("_")[0] + ".dat"
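To show how the two pieces of this answer might fit together, here is a minimal sketch that applies the split idea to every .dat file in one directory and moves each file with shutil.move. The helper names truncated_name and rename_in_directory are made up for illustration. The sketch also splits on the last underscore (rsplit) rather than the first, which is an assumption on my part so that a prefix containing an underscore would survive.

```python
import os
import shutil


def truncated_name(original_name):
    """Return the name with the trailing '_xx' dropped, keeping the extension."""
    stem, extension = os.path.splitext(original_name)  # 'snackbox_12', '.dat'
    return stem.rsplit("_", 1)[0] + extension          # 'snackbox' + '.dat'


def rename_in_directory(directory):
    """Rename every '<prefix>_xx.dat' file in `directory` to '<prefix>.dat'."""
    for name in os.listdir(directory):
        source = os.path.join(directory, name)
        # Only touch regular files that end in .dat
        if not (os.path.isfile(source) and name.endswith(".dat")):
            continue
        target = os.path.join(directory, truncated_name(name))
        if source != target:  # nothing to do if there was no '_xx' part
            shutil.move(source, target)
```

Note that this does not check whether the target name is already taken, so two files with the same prefix in one directory would overwrite each other.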

Answer 1: (score: 1)

Try re.sub:

import re
filename = 'snackbox_xx.dat'
filename_new = re.sub(r'_[A-Za-z0-9]{2}', '', filename)

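filename_new should then be:

'snackbox.dat'

This assumes that the two characters after the "_" are digits or lowercase/uppercase letters, but you can extend the character class in the regular expression to include others.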
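As one possible way of extending the character class, the sketch below accepts any two characters other than "_" and "." and only strips them when they sit directly in front of the final ".dat". The pattern and the helper name strip_suffix are my own assumptions, not part of the answer. The anchoring matters because the unanchored pattern replaces every "_" followed by two alphanumerics, so a name such as my_data_ab.dat would lose "_da" as well.

```python
import re

# Hypothetical pattern: '_' plus exactly two characters that are neither
# '_' nor '.', located immediately before the final '.dat'.
SUFFIX = re.compile(r"_[^_.]{2}(?=\.dat$)")


def strip_suffix(filename):
    """Remove the '_xx' part in front of '.dat'; other names are returned unchanged."""
    return SUFFIX.sub("", filename)


print(strip_suffix("snackbox_1a.dat"))   # snackbox.dat
print(strip_suffix("my_data_ab.dat"))    # my_data.dat
print(strip_suffix("snackbox.dat"))      # snackbox.dat
```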

Edit: now including the move and a recursive search

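import fnmatch
import os
import re
import shutil

directory = 'your_path'

# Walk the directory tree and strip the '_xx' part from every .dat file name.
for root, dirnames, filenames in os.walk(directory):
    for filename in fnmatch.filter(filenames, '*.dat'):
        filename_new = re.sub(r'_[A-Za-z0-9]{2}', '', filename)
        if filename_new != filename:  # skip names that have no '_xx' part
            shutil.move(os.path.join(root, filename), os.path.join(root, filename_new))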
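Because the loop renames as it goes, two files in the same directory that share a prefix would end up competing for one target name. A cautious option is to do a dry run first. The sketch below only computes what would be renamed and which targets clash, without changing anything on disk. The function name plan_renames and the anchored default pattern are assumptions for illustration.

```python
import os
import re
from collections import defaultdict


def plan_renames(directory, pattern=r"_[A-Za-z0-9]{2}(?=\.dat$)"):
    """Return (renames, conflicts) for the tree under `directory`; nothing is renamed."""
    targets = defaultdict(list)  # target path -> list of source paths
    for root, _dirnames, filenames in os.walk(directory):
        for filename in filenames:
            new_name = re.sub(pattern, "", filename)
            if new_name != filename:
                targets[os.path.join(root, new_name)].append(os.path.join(root, filename))

    # A rename is safe when exactly one source maps to a target that is still free.
    renames = {sources[0]: target for target, sources in targets.items()
               if len(sources) == 1 and not os.path.exists(target)}
    conflicts = {target: sources for target, sources in targets.items()
                 if len(sources) > 1 or os.path.exists(target)}
    return renames, conflicts
```

The entries in renames can then be passed to shutil.move one by one, and conflicts can be inspected by hand.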

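Answer 2: (score: 0)

This solution renames all files in a given directory (and, through a wrapper function, in its subdirectories) that match the patterns passed in the function call.

What the function does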

snackbox_5R.txt  >>>  snackbox.txt
snackbox_6y.txt  >>>  snackbox_0.txt
snackbox_a2.txt  >>>  snackbox_1.txt
snackbox_Tm.txt  >>>  snackbox_2.txt

Let's look at the function inputs and some examples.

list_of_files_names: this is a list of strings, where each string is a file name without the _?? part.

Examples:

  • ['snackbox.txt', 'zebrabar.txt', 'cornrows.txt', 'meatpack.txt', 'calc.txt']

  • ['text.dat']

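upper_bound=1000: this is an integer. When the ideal file name is already taken, e.g. snackbox.dat already exists, the function tries snackbox_0.dat, snackbox_1.dat, and so on, up to snackbox_999.dat with the default value (indices 0 to upper_bound - 1). You should not need to change the default.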
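The fallback naming described for upper_bound can be looked at in isolation. The following is a minimal sketch of that search, with a hypothetical helper next_free_name that is not part of the answer's code: it returns the ideal name if it is free, otherwise the first free numbered name, and None once all indices below upper_bound are used.

```python
import os


def next_free_name(directory, stem, extension, upper_bound=1000):
    """Return the first unused name among stem.ext, stem_0.ext, ..., or None."""
    ideal = "{}.{}".format(stem, extension)
    if not os.path.exists(os.path.join(directory, ideal)):
        return ideal
    for index in range(upper_bound):  # indices 0 .. upper_bound - 1
        candidate = "{}_{}.{}".format(stem, index, extension)
        if not os.path.exists(os.path.join(directory, candidate)):
            return candidate
    return None  # every candidate name is already taken
```

For example, in a directory that already holds snackbox.dat and snackbox_0.dat, next_free_name(directory, "snackbox", "dat") returns "snackbox_1.dat".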


The code

import re
import os
import os.path


def find_and_rename(dir, list_of_files_names, upper_bound=1000):
    """
    :param list_of_files_names: List. A list of strings: file name (without the _??) + extension, e.g. snackbox.txt
    Renames snackbox_R5.dat to snackbox.dat, etc.
    """
    # split each item in list_of_files_names into two parts, file name and extension: "snackbox.dat" -> ("snackbox", "dat")
    list_of_files_names = [tuple(name.rsplit('.', 1)) for name in list_of_files_names]

    # store the content of the dir in a list
    list_of_files_in_dir = os.listdir(dir)

    for file_in_dir in list_of_files_in_dir:  # iterate over all files and folders in dir
        file_in_dir_full_path = os.path.join(dir, file_in_dir)  # the full path is needed for .isfile() and for renaming
        print()  # DEBUG
        print('Is "{}" a file?: '.format(file_in_dir), end='')  # DEBUG
        print(os.path.isfile(file_in_dir_full_path))  # DEBUG
        if os.path.isfile(file_in_dir_full_path):  # filters out the folders, only files are needed

            # file_name is a tuple containing the file name prefix and the extension
            for file_name in list_of_files_names:  # check if the file matches one of our renaming prefixes

                # match both the file name (e.g. "snackbox") and the extension (e.g. "txt")
                # It finds "snackbox_5R.txt" by matching "snackbox" at the front and "txt" at the rear
                if re.match(r'{}_\w+\.{}$'.format(re.escape(file_name[0]), re.escape(file_name[1])), file_in_dir):
                    print('\nOriginal File: ' + file_in_dir)  # printing this is not necessary
                    print('.'.join(file_name))

                    ideal_new_file_name = '.'.join(file_name)  # name might already be taken
                    # print(ideal_new_file_name)
                    if os.path.isfile(os.path.join(dir, ideal_new_file_name)):  # file already exists
                        # go up a name, e.g. "snackbox.dat" --> "snackbox_0.dat" --> "snackbox_1.dat"
                        for index in range(upper_bound):
                            # check if this new name already exists as well
                            next_best_name = file_name[0] + '_' + str(index) + '.' + file_name[1]

                            # file does not already exist
                            if not os.path.isfile(os.path.join(dir, next_best_name)):
                                print('Renaming with next best name')
                                os.rename(file_in_dir_full_path, os.path.join(dir, next_best_name))
                                break

                            # otherwise this name is taken as well; try the next index

                    # file with ideal name does not already exist, rename with the ideal name (no _##)
                    else:
                        print('Renaming with ideal name')
                        os.rename(file_in_dir_full_path, os.path.join(dir, ideal_new_file_name))


def find_and_rename_include_sub_dirs(master_dir, list_of_files_names, upper_bound=1000):
    for path, subdirs, files in os.walk(master_dir):
        print(path)  # DEBUG
        find_and_rename(path, list_of_files_names, upper_bound)


find_and_rename_include_sub_dirs('C:/Users/Oxen/Documents/test_folder', ['snackbox.txt', 'zebrabar.txt', 'cornrows.txt', 'meatpack.txt', 'calc.txt'])