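Sorry; I know there are a thousand "make a unique list" threads. I tried to solve this on my own, and to hack another "make a unique list" solution to fit, but my not-so-incredible Python skills weren't up to it.
I have a list of video file names (these are shots in a movie). For any given shot, I want to remove duplicates based on one part of the path (circled in red in the image below); only the entry with the highest tk_ value should end up in the final list.
For example, in the image below, for shot de05_001, only tk_3 should end up in the list.
Input (with duplicates):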
raw_list = ['D:\\de05\\de05_001\\postvis\\tk_2\\blasts\\tb205_de05_001.POSTVIS.mov',
'D:\\de05\\de05_001\\postvis\\tk_3\\blasts\\tb205_de05_001.POSTVIS.mov',
'D:\\de05\\de05_002\\postvis\\tk_1\\blasts\\tb205_de05_002.POSTVIS.mov',
'D:\\de05\\de05_017\\postvis\\tk_2\\blasts\\tb205_de05_017.POSTVIS.mov',
'D:\\de05\\de05_019\\postvis\\tk_2\\blasts\\tb205_de05_019.POSTVIS.mov',
'D:\\de05\\de05_019\\postvis\\tk_3\\blasts\\tb205_de05_019.POSTVIS.mov',
'D:\\de05\\de05_019\\postvis\\tk_4\\blasts\\tb205_de05_019.POSTVIS.mov',
'D:\\de05\\de05_019\\postvis\\tk_1\\blasts\\tb205_de05_019.POSTVIS.mov', ]
Output (duplicates removed, only the highest tk_ number kept):
outputList = ['D:\\de05\\de05_001\\postvis\\tk_3\\blasts\\tb205_de05_001.POSTVIS.mov',
'D:\\de05\\de05_002\\postvis\\tk_1\\blasts\\tb205_de05_002.POSTVIS.mov',
'D:\\de05\\de05_017\\postvis\\tk_2\\blasts\\tb205_de05_017.POSTVIS.mov',
'D:\\de05\\de05_019\\postvis\\tk_4\\blasts\\tb205_de05_019.POSTVIS.mov', ]
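Any help would be great. Thanks.
Answer 0 (score: 1)
One way to do it is to build a dictionary keyed by shot and keep reassigning each key, so that every shot ends up holding only its highest take: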
import ntpath  # what os.path is on Windows; imported directly so these backslash paths split the same way on any OS

raw_list1 = [
    'D:\\\\de05\\de05_019\\postvis\\tk_2\\blasts\\tb205_de05_019.POSTVIS.mov',
    'D:\\\\de05\\de05_019\\postvis\\tk_3\\blasts\\tb205_de05_019.POSTVIS.mov',
    'D:\\\\de05\\de05_019\\postvis\\tk_4\\blasts\\tb205_de05_019.POSTVIS.mov',
    'D:\\\\de05\\de05_019\\postvis\\tk_1\\blasts\\tb205_de05_019.POSTVIS.mov',
    'D:\\\\tw05\\tw05_036\\postvis\\tk_9\\blasts\\tb205_tw05_036.POSTVIS.mov',
    'D:\\\\tw05\\tw05_036\\postvis\\tk_13\\blasts\\tb205_tw05_036.POSTVIS.mov'
]
raw_list2 = [
    'D:\\de05\\de05_001\\postvis\\tk_2\\blasts\\tb205_de05_001.POSTVIS.mov',
    'D:\\de05\\de05_001\\postvis\\tk_3\\blasts\\tb205_de05_001.POSTVIS.mov',
    'D:\\de05\\de05_002\\postvis\\tk_1\\blasts\\tb205_de05_002.POSTVIS.mov',
    'D:\\de05\\de05_017\\postvis\\tk_2\\blasts\\tb205_de05_017.POSTVIS.mov',
    'D:\\de05\\de05_019\\postvis\\tk_2\\blasts\\tb205_de05_019.POSTVIS.mov',
    'D:\\de05\\de05_019\\postvis\\tk_3\\blasts\\tb205_de05_019.POSTVIS.mov',
    'D:\\de05\\de05_019\\postvis\\tk_4\\blasts\\tb205_de05_019.POSTVIS.mov',
    'D:\\de05\\de05_019\\postvis\\tk_1\\blasts\\tb205_de05_019.POSTVIS.mov',
]


def path_split(p, folders=None):
    # Recursively split a path into a list of its components (the drive itself is dropped)
    folders = folders or []
    head, tail = ntpath.split(p)
    if not tail:
        return folders
    return path_split(head, [tail] + folders)


for raw_list in (raw_list1, raw_list2):
    results = {}
    for p in raw_list:
        # Split your path accordingly
        # For something simple you could have just done p.split('\\'), but since we're working with paths,
        # we might as well use ntpath.split (which is what os.path.split is on Windows)
        shot1, shot2, folder1, take, folder2, file_name = path_split(p)
        # If something like 'de05_019' defines your shot, make that the key
        key = shot2
        # Extract the take number into an integer
        new_take_num = int(take.split('_')[-1])
        # Try finding the take you already stored (default to take -1, which is lower than any real take;
        # a None default cannot be compared with an int in Python 3)
        existing_take_num, existing_path = results.get(key, (-1, None))
        # See if the new take is bigger than the existing one, based on the take number.
        # Lambda is there for comparison, meaning I'm only comparing the take numbers, not the paths.
        value = max((existing_take_num, existing_path), (new_take_num, p), key=lambda take_num_and_path: take_num_and_path[0])
        # Assign the value (which is either the existing take, or the new take)
        results[key] = value
    for res in sorted(results.values()):
        print(res)
    print('*' * 80)
This outputs the following (you could also just print(res[1]) to print only the path):
(4, 'D:\\\\de05\\de05_019\\postvis\\tk_4\\blasts\\tb205_de05_019.POSTVIS.mov')
(13, 'D:\\\\tw05\\tw05_036\\postvis\\tk_13\\blasts\\tb205_tw05_036.POSTVIS.mov')
********************************************************************************
(1, 'D:\\de05\\de05_002\\postvis\\tk_1\\blasts\\tb205_de05_002.POSTVIS.mov')
(2, 'D:\\de05\\de05_017\\postvis\\tk_2\\blasts\\tb205_de05_017.POSTVIS.mov')
(3, 'D:\\de05\\de05_001\\postvis\\tk_3\\blasts\\tb205_de05_001.POSTVIS.mov')
(4, 'D:\\de05\\de05_019\\postvis\\tk_4\\blasts\\tb205_de05_019.POSTVIS.mov')
********************************************************************************
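If you only need the paths back as a plain list, like outputList in the question, the same dictionary idea can be wrapped in a function. What follows is a minimal sketch, not a definitive implementation. The names highest_takes and TAKE_RE are made up for illustration, it assumes Python 3, and it assumes the shot folder always sits two levels above the tk_ folder, as it does in the sample paths. pathlib.PureWindowsPath is used so that the Windows-style paths split the same way on any OS.

```python
import re
from pathlib import PureWindowsPath

# Hypothetical helper pattern: matches a take directory such as "tk_3" or "tk_13".
TAKE_RE = re.compile(r"^tk_(\d+)$")


def highest_takes(paths):
    """Return one path per shot, keeping the path with the highest tk_ number."""
    best = {}  # shot name -> (take number, path)
    for path in paths:
        parts = PureWindowsPath(path).parts
        # Assumed layout: <drive>\<sequence>\<shot>\postvis\tk_<n>\blasts\<file>
        # Look for the take directory rather than relying on a fixed position.
        for index, part in enumerate(parts):
            match = TAKE_RE.match(part)
            if match and index >= 2:
                break
        else:
            continue  # no usable tk_ directory in this path, so skip it
        shot = parts[index - 2]  # assumption: shot folder is two levels above tk_<n>
        take = int(match.group(1))  # compare takes as integers, not strings
        if shot not in best or take > best[shot][0]:
            best[shot] = (take, path)
    # Sort by shot name so the result is ordered like the expected output.
    return [path for _shot, (_take, path) in sorted(best.items())]
```

Calling highest_takes(raw_list) on the input from the question should give back the four paths listed in outputList. Converting the take to an int is what lets tk_13 beat tk_9; compared as strings, '9' would sort above '13'.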