Python: comparing DictReader elements from two CSV files

Time: 2015-02-06 19:50:15

Tags: python csv dictionary compare

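I have two CSV files that I want to compare. I read them with csv.DictReader, so I now have dictionaries (one per row) for both files. I want to compare them like this: when two elements (the ones under the headers h1 and h2) are the same, compare the dictionaries and print the differences relative to the second file. Sample CSV files are below.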

csv1:

h1,h2,h3
aaa,g0,74
bjg,73,kg9

CSV_new:

h1,h2,h3,h4
aaa,g0,7,
bjg,73,kg9,ahf

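I would like output along these lines, though not necessarily in exactly this form. It should print the modifications, additions, and deletions in each dictionary relative to CSV_new: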

{h1:'aaa', h2:'g0' {h3:'74', h4:''}}
{h1:'bjg', h2:'73' {h4:''}

My code, which is not very developed yet:

import csv
f1 = "csv1.csv"
reader1 = csv.DictReader(open (f1), delimiter = ",")
for row1 in reader1:
    row1['h1']
#['%s:%s' % (f, row[f]) for f in reader.fieldnames]
f2 = "CSV_new.csv"
reader2 = csv.DictReader(open (f2), delimiter = ",")
for row2 in reader2:
    row2['h1']
if row1['h1'] == row2['h1']:
    print row1, row2

2 answers:

Answer 0 (score: 1)

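If you only want to find the differences, you can use difflib. For example:

import difflib
fo1 = open("csv1.csv")
fo2 = open("CSV_new.csv")
diff = difflib.ndiff(fo1.readlines(), fo2.readlines())

Then you can write out the differences as needed.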
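difflib.ndiff returns a generator of lines, each prefixed with "- " (only in the first input), "+ " (only in the second), "  " (common), or "? " (intraline hints). As a minimal sketch, the output could be filtered down to the changed lines like this. The helper name changed_lines is hypothetical, and the sample data from the question is used as in-memory lists rather than files:

```python
import difflib

def changed_lines(old_lines, new_lines):
    """Return only the ndiff lines that were removed ('- ') or added ('+ ')."""
    return [line for line in difflib.ndiff(old_lines, new_lines)
            if line.startswith(("- ", "+ "))]

# Sample data from the question, as lists of lines (what readlines() returns).
old = ["h1,h2,h3\n", "aaa,g0,74\n", "bjg,73,kg9\n"]
new = ["h1,h2,h3,h4\n", "aaa,g0,7,\n", "bjg,73,kg9,ahf\n"]

for line in changed_lines(old, new):
    print(line, end="")
```

Note that this is a line-based text diff: it does not know about CSV columns, so any change in a line (including an extra column) marks the whole line as changed.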

Answer 1 (score: 0)

This may be what you are looking for, but as noted above, there is some ambiguity in your description.

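import csv
import sys

A, B = "csv1.csv", "CSV_new.csv"  # assumed file names, taken from the question

with open(A) as fd1, open(B) as fd2:
    a, b = csv.reader(fd1), csv.reader(fd2)
    ha, hb = next(a), next(b)  # header rows
    # Every column of A must also exist in B.
    if not set(ha).issubset(set(hb)):
        sys.exit(1)

    # Map each shared label to its column index in A and in B.
    lookup = {label: (key, hb.index(label)) for key, label in enumerate(ha)}
    for rowa, rowb in zip(a, b):
        for key in lookup:
            index_a, index_b = lookup[key]
            if rowa[index_a] != rowb[index_b]:
                print(rowb)  # print the row from B on the first mismatch
                break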
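The code above pairs rows by position with zip. If rows should instead be matched on h1 and h2, as the question describes, a rough sketch with csv.DictReader might look like the following. It assumes that the pair (h1, h2) uniquely identifies a row; the names KEY_FIELDS, index_rows, and diff_rows are hypothetical, and io.StringIO stands in for the real files:

```python
import csv
import io

KEY_FIELDS = ("h1", "h2")  # assumption: these two columns identify a row

def index_rows(fileobj):
    """Read a CSV with DictReader and index its rows by the key columns."""
    return {tuple(row[k] for k in KEY_FIELDS): row
            for row in csv.DictReader(fileobj)}

def diff_rows(old_file, new_file):
    """Yield (key, changes) for rows present in both files.

    changes maps a column name to (old_value, new_value); a column that is
    missing from one file is reported with None on that side.
    """
    old_rows, new_rows = index_rows(old_file), index_rows(new_file)
    for key, new_row in new_rows.items():
        old_row = old_rows.get(key)
        if old_row is None:
            continue  # row exists only in the new file; skipped in this sketch
        changes = {}
        for field in sorted(set(old_row) | set(new_row)):
            old_val, new_val = old_row.get(field), new_row.get(field)
            if old_val != new_val:
                changes[field] = (old_val, new_val)
        if changes:
            yield key, changes

# Sample data from the question.
csv1 = io.StringIO("h1,h2,h3\naaa,g0,74\nbjg,73,kg9\n")
csv_new = io.StringIO("h1,h2,h3,h4\naaa,g0,7,\nbjg,73,kg9,ahf\n")

result = list(diff_rows(csv1, csv_new))
for key, changes in result:
    print(key, changes)
```

Indexing by key means the two files do not need to list their rows in the same order, at the cost of holding both files in memory.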