Python 3: filtering unusable dictionaries from an LTO tape archive

Time: 2018-05-22 08:32:07

Tags: python-3.x dictionary filter

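I'm very new to Python, so please excuse my lack of knowledge.

I'm currently working on a filter for a tape archive. What I get from the tape library is a very long CSV list with several columns, such as "Path, Media, MD5, etc."

I wrote a small filter script to pick out only the files that have been written to tape (all tape names start with K and are listed in the Media column):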

import csv
from tkinter import *
from tkinter.filedialog import askopenfilename

def callback():
    # Let the user pick the CSV export from the tape library
    filename = askopenfilename()
    # print(filename)
    return filename


errmsg = 'Error!'
Button(text='File Open', command=callback).pack(fill=X)

inputfile = callback()

def tapefilter():
    with open(inputfile, 'r') as csvfile:
        reader = csv.DictReader(csvfile)

        for row in reader:
            # Keep only rows stored on a K tape whose path contains a DT folder
            if 'K' in row['Media'] and 'DT' in row['Path']:
                x = row['Path']
                x = x.replace("/Backups", "")
                x = x.replace("/Volumes/", "")
                x = x.split("/")
                yield x

def output():
    # x[0] is the tape name, x[2] the dated DT folder
    for x in tapefilter():
        print(x[0], x[2])

output()
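
To see what `x[0]` and `x[2]` pick out, the script's replace-and-split steps can be traced on a single path. This is only a sketch: the path is taken from the sample line further down in this question, `tape_and_folder` is a hypothetical helper name, and the layout `/Volumes/<tape>/Backups/<project>/<dated folder>/...` is assumed from that one sample.

```python
def tape_and_folder(path):
    """Return (tape, dated_folder) for a path shaped like the sample.

    Hypothetical helper; assumes the layout
    /Volumes/<tape>/Backups/<project>/<dated folder>/...
    """
    # Same steps as in tapefilter(): strip the fixed parts, then split.
    parts = path.replace("/Backups", "").replace("/Volumes/", "").split("/")
    return parts[0], parts[2]


sample = "/Volumes/K00131/Backups/Testproject/20170518_DT22/Alexa_ProRes/A032R72N"
print(*tape_and_folder(sample))  # K00131 20170518_DT22
```

Every file below the same dated folder produces the same pair, which is why the output repeats.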

What I get at the moment is:

K00130 20170504_DT12
K00130 20170505_DT13
K00130 20170508_DT14
K00130 20170508_DT14_part02
K00130 20170511_DT17
K00130 20170508_DT14_part03
K00130 20170508_DT14_part03
K00130 20170508_DT14_Masterfiles
K00130 20170508_DT14_part03
K00130 20170508_DT14_part03
K00130 20170508_DT14_part03
K00130 20170508_DT14_Masterfiles
K00130 20170508_DT14_part03
K00130 20170508_DT14_part03
K00130 20170508_DT14_part03
K00131 20170508_DT14_part03
K00131 20170508_DT14_part03
K00131 20170508_DT14_part03

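What I need is a filter that outputs only the tape name and the distinct lines. So what changes is the tape number and this particular part of the path (the full path is much longer):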

"/Volumes/K00131/Backups/Testproject/20170518_DT22/Alexa_ProRes/A032R72N",K00131,".cardmeta.xml"

All the filter scripts I have found so far always throw a "non hashable" error.
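
That error most likely appears because the result of `split` is a list (and a `csv.DictReader` row is a dict), and mutable objects like these cannot be stored in a `set` or used as dictionary keys. A minimal sketch, using rows built from the output above, of the error and the usual workaround of converting to a tuple first:

```python
rows = [
    ["K00130", "Testproject", "20170508_DT14_part03"],
    ["K00130", "Testproject", "20170508_DT14_part03"],
    ["K00131", "Testproject", "20170508_DT14_part03"],
]

error_message = None
try:
    set(rows)  # lists are mutable, so they cannot be set members
except TypeError as exc:
    error_message = str(exc)
    print(error_message)  # the message mentions "unhashable type"

# Tuples of strings are hashable, so they work as set members.
unique = {(row[0], row[2]) for row in rows}
print(sorted(unique))
```

Note that a plain set does not remember insertion order, so `sorted` is used here only to get a stable printout.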

In the end it should look like this:

K00130 20170504_DT12
K00130 20170505_DT13
K00130 20170508_DT14
K00130 20170508_DT14_part02
K00130 20170511_DT17
K00130 20170508_DT14_part03
K00130 20170508_DT14_Masterfiles
K00131 20170508_DT14_part03

I'm a bit stuck :/

Best regards

1 answer:

Answer 0: (score: 0)

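As you step through the rows, you can store the values you have already seen and compare each new row against them: if it matches, skip to the next row without yielding a value: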

def tapefilter():
    with open(inputfile, 'r') as csvfile:
        reader = csv.DictReader(csvfile)
        prev_values = [] # <- keep a list of previous (unique) values
        for row in reader:
            if 'K' in row['Media'] and 'DT' in row['Path']:
                x = row['Path']
                x = x.replace("/Backups", "")
                x = x.replace("/Volumes/", "")
                x = x.split("/")
                if (x[0], x[2]) in prev_values:
                    continue # <- don't yield if we've already seen this row
                else:
                    prev_values.append((x[0], x[2])) # <- update prev_values
                    yield x

Result:

K00130 20170504_DT12
K00130 20170505_DT13
K00130 20170508_DT14
K00130 20170508_DT14_part02
K00130 20170511_DT17
K00130 20170508_DT14_part03
K00130 20170508_DT14_Masterfiles
K00131 20170508_DT14_part03
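
As a variation on the same idea, membership tests against a list get slower as the list grows, while a set of tuples gives the same first-seen order of output with constant-time lookups, which may matter for a very long CSV. The sketch below is not the code from this answer: `unique_tape_folders` is a hypothetical name, it takes any iterable of CSV lines instead of the global `inputfile`, it uses `startswith("K")` because the question says all tape names start with K, and the sample rows (including the `Path,Media,MD5` header and the placeholder MD5 values) are made up.

```python
import csv
import io


def unique_tape_folders(lines):
    """Yield each (tape, folder) pair once, in first-seen order.

    Hypothetical helper: `lines` is any iterable of CSV lines with
    'Path' and 'Media' columns, e.g. an open file object.
    """
    seen = set()  # tuples are hashable, so they can live in a set
    for row in csv.DictReader(lines):
        # Assumption: tape names start with "K", as stated in the question.
        if not (row["Media"].startswith("K") and "DT" in row["Path"]):
            continue
        parts = row["Path"].replace("/Backups", "").replace("/Volumes/", "").split("/")
        if len(parts) < 3:
            continue  # path too short to contain a dated folder
        key = (parts[0], parts[2])
        if key not in seen:
            seen.add(key)
            yield key


# Made-up sample rows; the MD5 values are placeholders.
SAMPLE = (
    "Path,Media,MD5\n"
    "/Volumes/K00130/Backups/Testproject/20170508_DT14/a.mov,K00130,0\n"
    "/Volumes/K00130/Backups/Testproject/20170508_DT14/b.mov,K00130,0\n"
    "/Volumes/K00131/Backups/Testproject/20170508_DT14/a.mov,K00131,0\n"
    "/Volumes/D00001/Backups/Testproject/20170508_DT14/a.mov,D00001,0\n"
)

for tape, folder in unique_tape_folders(io.StringIO(SAMPLE)):
    print(tape, folder)
```

With the real export, the same function could be fed an open file handle, for example inside `with open(inputfile, newline='') as csvfile:`.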