Python: unique list (based on part of the file path)

Time: 2016-06-15 21:09:50

Tags: python list

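Sorry; I know there are a thousand "make a unique list" threads. I tried to solve this on my own, and to adapt another "make a unique list" solution, but my limited Python skills weren't up to it.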

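I have a list of video file names (these are shots from a film). For any given shot, I want to remove duplicates based on one part of the path, the tk_ folder (circled in red in the original screenshot); only the entry with the highest tk_ value should end up in the final list.

For example, for shot de05_001, only tk_3 should end up in the list.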

Input (with duplicates):

raw_list = ['D:\\de05\\de05_001\\postvis\\tk_2\\blasts\\tb205_de05_001.POSTVIS.mov', 
'D:\\de05\\de05_001\\postvis\\tk_3\\blasts\\tb205_de05_001.POSTVIS.mov', 
'D:\\de05\\de05_002\\postvis\\tk_1\\blasts\\tb205_de05_002.POSTVIS.mov', 
'D:\\de05\\de05_017\\postvis\\tk_2\\blasts\\tb205_de05_017.POSTVIS.mov', 
'D:\\de05\\de05_019\\postvis\\tk_2\\blasts\\tb205_de05_019.POSTVIS.mov', 
'D:\\de05\\de05_019\\postvis\\tk_3\\blasts\\tb205_de05_019.POSTVIS.mov', 
'D:\\de05\\de05_019\\postvis\\tk_4\\blasts\\tb205_de05_019.POSTVIS.mov', 
'D:\\de05\\de05_019\\postvis\\tk_1\\blasts\\tb205_de05_019.POSTVIS.mov', ]

Output (duplicates removed, keeping only the highest tk_ number):

outputList = ['D:\\de05\\de05_001\\postvis\\tk_3\\blasts\\tb205_de05_001.POSTVIS.mov', 
'D:\\de05\\de05_002\\postvis\\tk_1\\blasts\\tb205_de05_002.POSTVIS.mov', 
'D:\\de05\\de05_017\\postvis\\tk_2\\blasts\\tb205_de05_017.POSTVIS.mov', 
'D:\\de05\\de05_019\\postvis\\tk_4\\blasts\\tb205_de05_019.POSTVIS.mov', ]

Any help would be great. Thanks.

1 answer:

Answer 0: (score: 1)

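One approach is to build a dictionary keyed by shot and keep reassigning each key, so that every shot ends up holding only its highest take: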

import os

raw_list1 = [
    'D:\\\\de05\\de05_019\\postvis\\tk_2\\blasts\\tb205_de05_019.POSTVIS.mov',
    'D:\\\\de05\\de05_019\\postvis\\tk_3\\blasts\\tb205_de05_019.POSTVIS.mov',
    'D:\\\\de05\\de05_019\\postvis\\tk_4\\blasts\\tb205_de05_019.POSTVIS.mov',
    'D:\\\\de05\\de05_019\\postvis\\tk_1\\blasts\\tb205_de05_019.POSTVIS.mov',
    'D:\\\\tw05\\tw05_036\\postvis\\tk_9\\blasts\\tb205_tw05_036.POSTVIS.mov',
    'D:\\\\tw05\\tw05_036\\postvis\\tk_13\\blasts\\tb205_tw05_036.POSTVIS.mov'
]
raw_list2 = [
    'D:\\de05\\de05_001\\postvis\\tk_2\\blasts\\tb205_de05_001.POSTVIS.mov',
    'D:\\de05\\de05_001\\postvis\\tk_3\\blasts\\tb205_de05_001.POSTVIS.mov',
    'D:\\de05\\de05_002\\postvis\\tk_1\\blasts\\tb205_de05_002.POSTVIS.mov',
    'D:\\de05\\de05_017\\postvis\\tk_2\\blasts\\tb205_de05_017.POSTVIS.mov',
    'D:\\de05\\de05_019\\postvis\\tk_2\\blasts\\tb205_de05_019.POSTVIS.mov',
    'D:\\de05\\de05_019\\postvis\\tk_3\\blasts\\tb205_de05_019.POSTVIS.mov',
    'D:\\de05\\de05_019\\postvis\\tk_4\\blasts\\tb205_de05_019.POSTVIS.mov',
    'D:\\de05\\de05_019\\postvis\\tk_1\\blasts\\tb205_de05_019.POSTVIS.mov',
]

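def path_split(p, folders=None):
    # Recursively split a path into its components (the drive root is dropped).
    # Note: os.path only treats '\\' as a separator when run on Windows.
    folders = folders or []
    head, tail = os.path.split(p)
    if not tail:
        return folders
    return path_split(head, [tail] + folders)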

for raw_list in (raw_list1, raw_list2):
    results = {}

    for p in raw_list:
        # Split your path accordingly
        # For something simple you could have just done s.split('\\'), but since we're working with paths, we might as well use os.path.split
        shot1, shot2, folder1, take, folder2, file_name = path_split(p)
        # If something like 'de05_019' defines your shot, make that the key
        key = shot2
        # Extract the take number into an integer
        new_take_num = int(take.split('_')[-1])
        # Try finding the take you already stored (None if this shot is new)
        existing = results.get(key)
        # Keep the new take if the shot is new or its take number is higher.
        # Only the take numbers are compared, not the paths, which also avoids
        # comparing None with an int (an error on Python 3).
        if existing is None or new_take_num > existing[0]:
            results[key] = (new_take_num, p)

    for res in sorted(results.values()):
        print(res)
    print('*' * 80)

This outputs the following (you can also print just res[1] to get only the path):

(4, 'D:\\\\de05\\de05_019\\postvis\\tk_4\\blasts\\tb205_de05_019.POSTVIS.mov')
(13, 'D:\\\\tw05\\tw05_036\\postvis\\tk_13\\blasts\\tb205_tw05_036.POSTVIS.mov')
********************************************************************************
(1, 'D:\\de05\\de05_002\\postvis\\tk_1\\blasts\\tb205_de05_002.POSTVIS.mov')
(2, 'D:\\de05\\de05_017\\postvis\\tk_2\\blasts\\tb205_de05_017.POSTVIS.mov')
(3, 'D:\\de05\\de05_001\\postvis\\tk_3\\blasts\\tb205_de05_001.POSTVIS.mov')
(4, 'D:\\de05\\de05_019\\postvis\\tk_4\\blasts\\tb205_de05_019.POSTVIS.mov')
********************************************************************************
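
The same idea can be sketched more compactly for Python 3. This is only a minimal sketch under some assumptions that are not in the original answer: the function name `latest_takes` is made up for illustration, the take folder is assumed to always be named `tk_<number>`, the shot folder is assumed to sit two levels above it (as in `de05_001\postvis\tk_3`), and paths without a take folder are simply skipped. It splits on backslashes directly rather than through `os.path`, so it does not depend on the platform it runs on.

```python
import re


def latest_takes(paths):
    """Return one path per shot, keeping the highest tk_ number.

    Hypothetical helper: assumes the layout <shot>\\postvis\\tk_N\\...
    """
    best = {}  # shot folder name -> (take number, full path)
    for path in paths:
        # Split on backslashes (or forward slashes), dropping empty pieces.
        parts = [part for part in re.split(r"[\\/]+", path) if part]
        # Find the take folder by name instead of relying on a fixed depth.
        take_index = next(
            (i for i, part in enumerate(parts) if re.fullmatch(r"tk_\d+", part)),
            None,
        )
        if take_index is None or take_index < 2:
            continue  # assumption: ignore paths that do not fit the layout
        take_num = int(parts[take_index].split("_")[1])
        shot = parts[take_index - 2]
        # Store the path if the shot is new or this take number is higher.
        if shot not in best or take_num > best[shot][0]:
            best[shot] = (take_num, path)
    # Dicts preserve insertion order (Python 3.7+): shots stay in first-seen order.
    return [path for _take_num, path in best.values()]


shots = [
    'D:\\de05\\de05_001\\postvis\\tk_2\\blasts\\tb205_de05_001.POSTVIS.mov',
    'D:\\de05\\de05_001\\postvis\\tk_3\\blasts\\tb205_de05_001.POSTVIS.mov',
    'D:\\de05\\de05_002\\postvis\\tk_1\\blasts\\tb205_de05_002.POSTVIS.mov',
]
for path in latest_takes(shots):
    print(path)
```

With the three sample paths, this prints the tk_3 path for de05_001 followed by the tk_1 path for de05_002. Because take numbers are compared as integers, tk_13 ranks above tk_9, which a plain string comparison would get wrong.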