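I'm new to Python, so please excuse my lack of knowledge.
I'm currently writing a filter for a tape archive. What I get from the tape library is a long CSV list with several columns, e.g. "Path, Media, MD5, etc."
I wrote a small filter script that reads out only the files that were written to tape (all tape names start with K and appear in the Media column):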
from tkinter import *
from tkinter.filedialog import askopenfilename

def callback():
    filename = askopenfilename()
    # print(filename)
    return filename

errmsg = 'Error!'
Button(text='File Open', command=callback).pack(fill=X)
inputfile = callback()

import csv

def tapefilter():
    with open(inputfile, 'r') as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            if 'K' in row['Media'] and 'DT' in row['Path']:
                x = row['Path']
                x = x.replace("/Backups", "")
                x = x.replace("/Volumes/", "")
                x = x.split("/")
                yield x

def output():
    for x in tapefilter():
        print(x[0], x[2])

output()
What I get right now is:
K00130 20170504_DT12
K00130 20170505_DT13
K00130 20170508_DT14
K00130 20170508_DT14_part02
K00130 20170511_DT17
K00130 20170508_DT14_part03
K00130 20170508_DT14_part03
K00130 20170508_DT14_Masterfiles
K00130 20170508_DT14_part03
K00130 20170508_DT14_part03
K00130 20170508_DT14_part03
K00130 20170508_DT14_Masterfiles
K00130 20170508_DT14_part03
K00130 20170508_DT14_part03
K00130 20170508_DT14_part03
K00131 20170508_DT14_part03
K00131 20170508_DT14_part03
K00131 20170508_DT14_part03
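What I need is a filter that prints only the distinct lines together with the tape name, so that a new line appears only when the tape number or this specific part of the path changes (the full path is much longer):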
"/Volumes/K00131/Backups/Testproject/20170518_DT22/Alexa_ProRes/A032R72N",K00131,".cardmeta.xml"
All the filter scripts I have found so far always throw a "non hashable" error.
In the end, it should look like this:
K00130 20170504_DT12
K00130 20170505_DT13
K00130 20170508_DT14
K00130 20170508_DT14_part02
K00130 20170511_DT17
K00130 20170508_DT14_part03
K00130 20170508_DT14_Masterfiles
K00131 20170508_DT14_part03
I'm a bit stuck :/
Best regards
Answer 0 (score: 0)
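As you step through the rows, you can store the values you have already seen and compare each new row against them: if it matches, just skip to the next row without yielding a value: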
def tapefilter():
    with open(inputfile, 'r') as csvfile:
        reader = csv.DictReader(csvfile)
        prev_values = []  # <- keep a list of previous (unique) values
        for row in reader:
            if 'K' in row['Media'] and 'DT' in row['Path']:
                x = row['Path']
                x = x.replace("/Backups", "")
                x = x.replace("/Volumes/", "")
                x = x.split("/")
                if (x[0], x[2]) in prev_values:
                    continue  # <- don't yield if we've already seen this row
                else:
                    prev_values.append((x[0], x[2]))  # <- update prev_values
                    yield x
Result:
K00130 20170504_DT12
K00130 20170505_DT13
K00130 20170508_DT14
K00130 20170508_DT14_part02
K00130 20170511_DT17
K00130 20170508_DT14_part03
K00130 20170508_DT14_Masterfiles
K00131 20170508_DT14_part03
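A note on the "non hashable" error: the traceback isn't shown, so this is a guess, but it is most likely `TypeError: unhashable type: 'list'`. `str.split()` returns a list, and lists cannot be stored in a `set` or used as dictionary keys. A tuple such as `(x[0], x[2])` can, which also lets you swap the list of previous values for a set, so each lookup stays fast on a long CSV. Below is a minimal sketch of that variant. The function name `unique_tape_folders`, the sample rows and the MD5 values are made up for illustration, and it uses `startswith('K')` rather than `'K' in ...` to match "all tape names start with K"; it also takes an open file object instead of the global `inputfile`.

```python
import csv
import io


def unique_tape_folders(csvfile):
    """Yield each (tape, folder) pair once, in order of first appearance."""
    seen = set()  # tuples are hashable, so they can be stored in a set
    for row in csv.DictReader(csvfile):
        if row['Media'].startswith('K') and 'DT' in row['Path']:
            path = row['Path'].replace("/Backups", "").replace("/Volumes/", "")
            parts = path.split("/")
            key = (parts[0], parts[2])  # (tape name, dated folder)
            if key not in seen:
                seen.add(key)
                yield key


# Made-up sample data using the column names from the question.
sample = io.StringIO(
    "Path,Media,MD5\n"
    "/Volumes/K00130/Backups/Testproject/20170504_DT12/clip1,K00130,aaa\n"
    "/Volumes/K00130/Backups/Testproject/20170508_DT14_part03/clip1,K00130,bbb\n"
    "/Volumes/K00130/Backups/Testproject/20170504_DT12/clip2,K00130,ccc\n"
    "/Volumes/K00131/Backups/Testproject/20170508_DT14_part03/clip1,K00131,ddd\n"
    "/Volumes/Disk01/Backups/Testproject/20170508_DT14_part03/clip1,Disk01,eee\n"
)

result = list(unique_tape_folders(sample))
for tape, folder in result:
    print(tape, folder)

# For comparison: putting the raw list from split() into a set raises TypeError.
try:
    set([["K00130", "20170504_DT12"]])
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

With a real file you would call it as `with open(inputfile, newline='') as f: ...` and loop over `unique_tape_folders(f)`. The order of first appearance is preserved because the set is only used for the membership test, not for producing the output.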