Question

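I'm very new to Python, so please excuse my lack of knowledge.

I'm currently writing a filter for a tape archive. What I get from the tape library is a long CSV list with several columns, e.g. "Path, Media, MD5, etc."

I wrote a small filter script that reads out only the files that were written to tape (all tape names start with K and appear in the Media column):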

from tkinter import *
from tkinter.filedialog import askopenfilename

def callback():
    # Open a file dialog and return the selected path
    filename = askopenfilename()
    # print(filename)
    return filename


errmsg = 'Error!'
Button(text='File Open', command=callback).pack(fill=X)

inputfile = callback()

import csv

def tapefilter():
    with open(inputfile, 'r') as csvfile:
        reader = csv.DictReader(csvfile)

        for row in reader:
            # keep only rows that are on a tape (K...) and belong to a DT folder
            if 'K' in row['Media'] and 'DT' in row['Path']:
                x = row['Path']
                x = x.replace("/Backups","")
                x = x.replace("/Volumes/", "")
                x = x.split("/")
                yield x

def output():
    for x in tapefilter():
        # x[0] is the tape name, x[2] is the dated DT folder
        print(x[0], x[2])

output()

What I get now is:

K00130 20170504_DT12
K00130 20170505_DT13
K00130 20170508_DT14
K00130 20170508_DT14_part02
K00130 20170511_DT17
K00130 20170508_DT14_part03
K00130 20170508_DT14_part03
K00130 20170508_DT14_Masterfiles
K00130 20170508_DT14_part03
K00130 20170508_DT14_part03
K00130 20170508_DT14_part03
K00130 20170508_DT14_Masterfiles
K00130 20170508_DT14_part03
K00130 20170508_DT14_part03
K00130 20170508_DT14_part03
K00131 20170508_DT14_part03
K00131 20170508_DT14_part03
K00131 20170508_DT14_part03

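What I need is a filter that outputs only the distinct lines, i.e. one line for each combination of tape name and this particular part of the path. Only the tape number and this specific part of the path change (the full path is much longer):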

"/Volumes/K00131/Backups/Testproject/20170518_DT22/Alexa_ProRes/A032R72N",K00131,".cardmeta.xml"

All the filter scripts I have found so far always throw a "non hashable" error.

In the end, it should look like this:

K00130 20170504_DT12
K00130 20170505_DT13
K00130 20170508_DT14
K00130 20170508_DT14_part02
K00130 20170511_DT17
K00130 20170508_DT14_part03
K00130 20170508_DT14_Masterfiles
K00131 20170508_DT14_part03

I'm a bit stuck :/

Best regards

Answer 1

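As you step through the rows, you can store the values you have already seen and compare each new row against them: if it matches, skip to the next row without yielding a value: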

def tapefilter():
    with open(inputfile, 'r') as csvfile:
        reader = csv.DictReader(csvfile)
        prev_values = [] # <- keep a list of previous (unique) values
        for row in reader:
            if 'K' in row['Media'] and 'DT' in row['Path']:
                x = row['Path']
                x = x.replace("/Backups", "")
                x = x.replace("/Volumes/", "")
                x = x.split("/")
                if (x[0], x[2]) in prev_values:
                    continue # <- don't yield if we've already seen this row
                else:
                    prev_values.append((x[0], x[2])) # <- update prev_values
                    yield x

Result:

K00130 20170504_DT12
K00130 20170505_DT13
K00130 20170508_DT14
K00130 20170508_DT14_part02
K00130 20170511_DT17
K00130 20170508_DT14_part03
K00130 20170508_DT14_Masterfiles
K00131 20170508_DT14_part03
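
A note on the "non hashable" error: it most likely comes from putting the list returned by split() (or a whole DictReader row) into a set, since lists and dicts are not hashable while tuples are. The following is only a sketch of a variant of the approach above that keeps the seen (tape, folder) tuples in a set instead of a list, which makes the membership test cheaper on a long CSV. The function name unique_tape_folders, the third column name and the sample rows are assumptions made up for illustration; only the Path and Media columns come from the question.

```python
import csv
import io


def unique_tape_folders(rows):
    """Yield each (tape, folder) pair once, in first-seen order."""
    seen = set()  # holds tuples; a list from str.split() could not be added here
    for row in rows:
        if 'K' in row['Media'] and 'DT' in row['Path']:
            parts = row['Path'].replace("/Backups", "").replace("/Volumes/", "").split("/")
            key = (parts[0], parts[2])  # tuple of tape name and DT folder
            if key not in seen:
                seen.add(key)
                yield key


# Hypothetical sample data shaped like the CSV described in the question
sample = io.StringIO(
    'Path,Media,Name\n'
    '"/Volumes/K00130/Backups/Testproject/20170504_DT12/clip1",K00130,a\n'
    '"/Volumes/K00130/Backups/Testproject/20170504_DT12/clip2",K00130,b\n'
    '"/Volumes/K00130/Backups/Testproject/20170505_DT13/clip1",K00130,c\n'
    '"/Volumes/K00131/Backups/Testproject/20170505_DT13/clip1",K00131,d\n'
    '"/Volumes/K00130/Backups/Testproject/20170504_DT12/clip3",K00130,e\n'
)

pairs = list(unique_tape_folders(csv.DictReader(sample)))
for tape, folder in pairs:
    print(tape, folder)
```

Unlike tapefilter(), this sketch yields the two-element tuple itself rather than the full split path, so a caller would print x[0] and x[1] instead of x[0] and x[2].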

Python 3: filtering an unhashable dict for an LTO tape archive
