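I have a main folder full of subfolders, each containing files that follow a particular naming scheme. I have unit-tested a function that creates and edits text documents in a single directory based on the information in these files, but I'm now running into problems getting that function to iterate over every subdirectory.
Problem:
I get a KeyError on line 38, if (row["r_id"]) in filters:. I believe this is because the file br_ids.csv is not being created. The function worked in the unit test, so I can only assume there is a problem with how I'm using os.walk.
Code: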
import csv
import os

with open('hasf.txt','w') as hf:
    for root, subFolders, files in os.walk('/path/to/topdir/'):
        #if folder contains 'f_r.csv', list the path in 'hasf.txt'
        if 'f_r.csv' in files:
            hf.write("%s\n" % root)
        if 'r.csv' in files:
            with open(os.path.join(root, "r.csv")) as inf, open(os.path.join(root, "br_ids.csv"), "w") as output:
                reader = csv.DictReader(inf, quotechar='"')
                headers = ["r_id"]
                writer_br = csv.DictWriter(output, headers, extrasaction='ignore')
                writer_br.writeheader()
                for row in reader:
                    if int(row["r_type"]) == 3:
                        writer_br.writerow(row)
            # End creating br_ids
            # parse the data you're about to filter with
            with open(os.path.join(root, 'br_ids.csv'), 'r') as f:
                filters = {(row["r_id"]) for row in csv.DictReader(f, delimiter=',', quotechar='"')}
            with open(os.path.join(root, 'bt_ids.csv'), 'w') as out_f:
                headers = ["t_id"]
                out = csv.DictWriter(out_f, headers, extrasaction='ignore')
                out.writeheader()
                # go thru your rows and see if the matching(row[r_id]) is
                # found in the previously parsed set of filters; if yes, skip the row
                with open(os.path.join(root, 't.csv'), 'r') as f:
                    for row in csv.DictReader(f, delimiter=','):
                        if (row["r_id"]) in filters:
                            out.writerow(row)
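I've found some similar questions here, but none of them create, edit, and use files directly inside each location visited by os.walk. This is my first time using Python and I'm a bit lost. Also, if there is any way to make the rest of my code more Pythonic, I'd be glad to hear it.
Thanks!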
Answer 0 (score: 0)
It turns out the problem was the KeyError itself: in some folders, br_ids.csv has no entries, so a KeyError was thrown. I worked around it with try, like this:
# parse the data you're about to filter with
with open(os.path.join(root, 'br_ids.csv'), 'r') as f:
    filters = {(row["r_id"]) for row in csv.DictReader(f, delimiter=',', quotechar='"')}
with open(os.path.join(root, 'bt_ids.csv'), 'w') as out_f:
    headers = ["t_id"]
    out = csv.DictWriter(out_f, headers, extrasaction='ignore')
    out.writeheader()
    # go thru your rows and see if the matching(row[r_id]) is
    # found in the previously parsed set of filters; if yes, skip the row
    with open(os.path.join(root, 't.csv'), 'r') as f:
        for row in csv.DictReader(f, delimiter=','):
            try:
                if (row["r_id"]) in filters:
                    out.writerow(row)
            except KeyError:
                continue
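A possible alternative to the try/except, as a minimal sketch only: dict.get returns None instead of raising when a row has no r_id key, so the membership test simply fails for such rows. The helper name matching_rows and the in-memory sample data are hypothetical stand-ins for the real files, and the sketch assumes None never appears in filters.

```python
import csv
import io


def matching_rows(rows, filters, key="r_id"):
    """Yield rows whose `key` value is in `filters`; rows lacking `key` are skipped."""
    for row in rows:
        # row.get() returns None when the column is missing, so no KeyError is raised
        if row.get(key) in filters:
            yield row


# Hypothetical in-memory data standing in for t.csv
sample = io.StringIO("t_id,r_id\n10,1\n11,2\n12,3\n")
kept = [row["t_id"] for row in matching_rows(csv.DictReader(sample), {"1", "3"})]

# A file without an r_id column yields nothing instead of raising
no_column = io.StringIO("t_id,other\n20,x\n")
kept_none = list(matching_rows(csv.DictReader(no_column), {"1", "3"}))
```

Inside the os.walk loop, the rows yielded by such a helper could be passed straight to out.writerow.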
In another case I had an if (row["r_id"]) not in filters: and got around it the same way, except that when a KeyError is raised it goes on to execute out.writerow(row).
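That second variant might look roughly like the following sketch. The function name write_unmatched and the in-memory sample data are hypothetical and stand in for the real files; the only point illustrated is that a row without an r_id key is written rather than skipped.

```python
import csv
import io


def write_unmatched(rows, filters, out):
    """Write rows whose r_id is not in `filters`; rows with no r_id key are written too."""
    for row in rows:
        try:
            if row["r_id"] not in filters:
                out.writerow(row)
        except KeyError:
            # No r_id column in this row: treat it as unmatched and keep it
            out.writerow(row)


# Hypothetical in-memory data standing in for t.csv and the output file
source = io.StringIO("t_id,r_id\n10,1\n11,2\n12,3\n")
target = io.StringIO()
out = csv.DictWriter(target, ["t_id"], extrasaction="ignore")
out.writeheader()
write_unmatched(csv.DictReader(source), {"1", "3"}, out)
result = target.getvalue().split()
```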