根据名称条件更改文件/文件夹树结构

时间:2016-10-11 18:04:43

标签: python path shutil

我有一个像这样的文件夹/文件树:

/source/photos/d831fae7-ed7f-44b1-8345-54fc54f0710f/car/1.jpg
/source/photos/20a33e40-8bb2-4ebe-b703-632115ba6714/house/
/source/photos/20a33e40-8bb2-4ebe-b703-632115ba6714/boat/b6a1b8bf-7f4c-45d6-84c1-37fbb8204328/2.jpg
/source/20dd7963-0d4a-4a80-83f8-4800de672087/music/1.mp3
/source/64e997aa-bb7e-4cdf-9348-8b8d48e2d336/music/c6a0b1d4-9d2d-4a21-bce3-8c922f8ad55b/2.mp3
/source/movies/83e760f4-7235-4d7e-bd51-56aa82192a94/572f3820-ea22-40c1-903a-31b7f412ae38/1.mp4
/source/movies/993209ed-092a-4665-a5d1-4ce537e2a680/4c200cf1-eb6b-40a7-84d7-9a2db0f75e09/1.mp4

为了方便阅读上一棵树,这里的表示更为简单:

/source/photos/uuid0/car/1.jpg
/source/photos/uuid1/house/
/source/photos/uuid1/boat/uuid2/2.jpg
/source/uuid3/music/1.mp3
/source/uuid4/music/uuid5/2.mp3
/source/movies/uuid6/uuid7/1.mp4
/source/movies/uuid8/uuid9/1.mp4

我想将文件夹和文件从“source”移动到“destination”目录,并在动态树形结构中执行调整。生成的树应如下所示:

/destination/photos/car/1.jpg
/destination/photos/house/
/destination/photos/boat/2.jpg
/destination/music/1.mp3
/destination/music/2.mp3
/destination/movies/1_1.mp4
/destination/movies/1_2.mp4

如你所见,我想:

  • 忽略路径中间的每个uuid;
  • 如果文件或文件夹名称存在冲突(即1.mp4),则应添加增量后缀(即1_1.mp4);
  • 移动空文件夹(例如“house”);
  • 移动时避免冲突(在移动子目录之前移动父目录) - 它应递归侵入树并移动其内容。

我尝试使用os.walk解析路径,但无法实现此目的。

有什么想法吗?谢谢!

注意: uuid(即6e56c11b-3adf-440e-96f5-375884c96c55)可以使用以下函数进行检查:

import uuid
def validate_uuid4(uuid_string):
    try:
        val = uuid.UUID(uuid_string, version=4)
    except ValueError:
        return False
    return True

编辑:示例代码

主要问题是:

给出以下结构

.
├── Icon\r
└── folder1
    ├── d.txt
    └── folder1.1
        ├── 64e997aa-bb7e-4cdf-9348-8b8d48e2d336
        │   └── a.mkv
        └── d831fae7-ed7f-44b1-8345-54fc54f0710f
            ├── b.mkv
            └── d831fae7-ed7f-44b1-8345-54fc54f0710f
                ├── b.mkv
                └── c.jpg

使用此代码:

#!/usr/bin/python
import os
import uuid

args = {}
args['rootdirOriginal'] = "/Users/xxx/Desktop/UploadDropbox"
pathString = []
pathStringClean=[]

def validate_uuid4(uuid_string):
    try:
        val = uuid.UUID(uuid_string, version=4)
    except ValueError:
        return False
    return True


for dirpath, dirs, files in os.walk(args['rootdirOriginal']):
    if files:
        for f in files:
            pathTmp = []
            pathRelative = os.path.relpath(dirpath, args['rootdirOriginal'])
            for p in pathRelative.split("/"): 
                pathTmp.append(p)
            pathTmp.append(f)   

            pathTmpClean = [x for x in pathTmp if not validate_uuid4(x) and x[0] != "." and x[0:4]!="Icon"]

            pathStringTmp = ("/").join(pathTmp)
            pathStringTmpClean = ("/").join(pathTmpClean)

            if len(pathTmp) > 0:
                pathString.append(pathStringTmp)
                pathStringClean.append(pathStringTmpClean)

print pathString
print pathStringClean

这是第一个输出:

['./.DS_Store', './Icon\r', 'folder1/.DS_Store', 'folder1/d.txt', 'folder1/folder1.1/.DS_Store', 'folder1/folder1.1/64e997aa-bb7e-4cdf-9348-8b8d48e2d336/.DS_Store', 'folder1/folder1.1/64e997aa-bb7e-4cdf-9348-8b8d48e2d336/a.mkv', 'folder1/folder1.1/d831fae7-ed7f-44b1-8345-54fc54f0710f/.DS_Store', 'folder1/folder1.1/d831fae7-ed7f-44b1-8345-54fc54f0710f/b.mkv', 'folder1/folder1.1/d831fae7-ed7f-44b1-8345-54fc54f0710f/d831fae7-ed7f-44b1-8345-54fc54f0710f/b.mkv', 'folder1/folder1.1/d831fae7-ed7f-44b1-8345-54fc54f0710f/d831fae7-ed7f-44b1-8345-54fc54f0710f/c.jpg']

这是第二个:

['', '', 'folder1', 'folder1/d.txt', 'folder1/folder1.1', 'folder1/folder1.1', 'folder1/folder1.1/a.mkv', 'folder1/folder1.1', 'folder1/folder1.1/b.mkv', 'folder1/folder1.1/b.mkv', 'folder1/folder1.1/c.jpg']

我不能只删除重复项,因为有些时候它们不是真正重复的,而是应该像我之前描述的那样重命名

2 个答案:

答案 0 :(得分:1)

我没有时间给出完整的实现,有很多方法可以做到这一点,但这里有一个大纲,可能会为你提供一个开始。 您的问题不是非常具体,除非您想要一个完整的解决方案,即您遇到的具体问题等等,但这里有:

import os
from sets import Set

def remove_uuids_from_path(path):
    # implement a function to remove uuids from the path name
    # use os.path.split or os.path.splitdrive and validate_uuid function
    # to build new paths without uuids

# build a set of source paths to work on
# or you can attempt to create the new path here
# and move files, as the file names are there in files list
source_paths = Set()
for root, dirs, files in os.walk(source_dir):
    source_paths.add(root)
    # edit: if you want file paths do:
    for file_name in files:
        file_name_path = os.path.join(root, file)

# run through source paths, replace source name with destination name
for s_path in source_paths:
  new_path = remove_uuids_from_path(s_path.replace(source_name,    destination_name))
  # create new directory if it doesn't allready exist
  if not os.path.exists(new_path):
      os.makedirs(new_path)
  # read file names from source directory into a list
  # move files to new directory if they don't exists there allready, else employ naming scheme.
  for new_path_to_file in file_name_list:
    if new_path_to_file.is_file():
        # change name by inspecting destination files names
    else:
       # move/copy file

答案 1 :(得分:0)

这是我的结束解决方案(谢谢Sushanta!)

#!/usr/bin/python
import os
import uuid
from collections import Counter
import shutil

args = {}
args['rootdirOriginal'] = "/Users/xxx/Desktop/UploadDropbox"
args['uuid'] = str(uuid.uuid4())
args['rootdir'] = args['rootdirOriginal']+"/"+args['uuid']

pathString = []
pathStringClean=[]
pathStringFolder = []
toDelete = []


# Validate if string is a valid UUID
def validate_uuid4(uuid_string):
    try:
        val = uuid.UUID(uuid_string, version=4)
    except ValueError:
        return False
    return True


# Walk through "source" directory
for dirpath, dirs, files in os.walk(args['rootdirOriginal']):
    # Files
    if files:
        for f in files:
            pathTmp = []
            pathFolderTmp = []
            pathRelative = os.path.relpath(dirpath, args['rootdirOriginal'])

            for p in pathRelative.split("/"): 
                pathTmp.append(p)
                pathFolderTmp.append(p)

            # Flag every fuke   
            pathTmp.append(f+"***") 

            # Delete list elements whose name are:  a previous UUID //// starting with "." //// starting with  "Icon"
            pathTmpClean = [x for x in pathTmp if not validate_uuid4(x) and x[0] != "." and x[0:4]!="Icon"]
            pathFolderTmpClean = [x for x in pathFolderTmp if not validate_uuid4(x) and x[0] != "." and x[0:4]!="Icon"]

            # Convert to path string
            if len(pathTmpClean) > 0:
                pathStringTmp = ("/").join(pathTmp)
                pathStringTmpFolder = ("/").join(pathFolderTmpClean)
                pathStringTmpClean = ("/").join(pathTmpClean)
                pathString.append(pathStringTmp)
                pathStringClean.append(pathStringTmpClean)
                if pathStringTmpFolder != "":
                    pathStringFolder.append(pathStringTmpFolder)

    # Empty directory               
    if dirs:
        for d in dirs:
            emptyDir = os.path.relpath(dirpath, args['rootdirOriginal'])
            if emptyDir == ".":
                emptyDir = d
            else:
                emptyDir =  os.path.join(emptyDir,d)

            pathStringFolder.append(emptyDir)

# Delete repeating directories          
pathStringFolder = list(set(pathStringFolder))

# Create a list with the indexes of first list when it is a directory 
for i in range(0,len(pathStringClean)):
    if (len(pathStringClean[i]) > 3):
        if pathStringClean[i][-3:] != "***":
            toDelete.append(i)

# Delete indexes of both "source" and "destinantion" lists where it is directory
for i in sorted(toDelete, reverse=True):
    del pathString[i]
    del pathStringClean[i]

# Delete the flag "***"
pathString = [x[:-3] for x in pathString]
pathStringClean = [x[:-3] for x in pathStringClean]

# Rename repeated filenames - create sequential suffix
counts = Counter(pathStringClean)
for s,num in counts.items():
    if num > 1:
        for suffix in range(1, num + 1):
            pathStringClean[pathStringClean.index(s)] = ("/").join(s.split("/")[:-1]) +("/")+(".").join((s.split("/")[-1]).split(".")[:-1])+"_"+str(suffix)+"."+(s.split("/")[-1]).split(".")[-1]

pathStringFolder.reverse()

# Create root "destination" directory
os.mkdir(args['rootdir'])

# Create other directories 
for folder in pathStringFolder:
    try:
        os.makedirs(os.path.join(args['rootdir'],folder))
    except:
        pass

# Move files
for i in range(0,len(pathStringClean)):
    os.rename(os.path.join(args['rootdirOriginal'],pathString[i]),os.path.join(args['rootdir'],pathStringClean[i]))

# Delete everything except root "destination" directory
for f in os.listdir(args['rootdirOriginal']):
    if (not (os.path.isfile(os.path.join( args['rootdirOriginal'],f))) and f != args['uuid']):                  
        shutil.rmtree(os.path.join( args['rootdirOriginal'],f))