What is this script for comparing tab-delimited CSV files missing?

Asked: 2017-02-01 16:04:35

Tags: python csv dictionary reader

I am writing a Python script to compare CSV files. However, it only works for comma-delimited files, even when the delimiter is set to \t:

d='\t'

for x in range(0, columns):
    with open(mfile, 'rb') as master:       
        with open(cfile, 'rb') as check:
            master_indices = dict((r[x], i) for i, r in enumerate(csv.reader(master, delimiter=d))) 
            check_reader = csv.reader(check, delimiter=d)

            for row in check_reader:
                index = master_indices.get(row[x])

                if index is not None:
                    T += 1
                    matches += 1
                else:
                    T += 1

Edit:

Test case 1:

Mfile:

a,1
a,2

Cfile:

x,2
x,z

With d = ','

Both columns are read, returning 1 match, and T is 4.

Test case 2:

Mfile:

a    1
a    2

Cfile:

x    2
x    z

With d = '\t'

Column 1 is read, returning 0 matches, and T is 2.

Revision: working code based on the provided and accepted answer:

for x in range(0, columns):
    with open(mfile, 'rb') as master:
        dialect = csv.Sniffer().sniff(master.read(1024))
        master.seek(0)
        master_reader = csv.reader(master, dialect)

        with open(cfile, 'rb') as check:
            dialect = csv.Sniffer().sniff(check.read(1024))
            check.seek(0)
            check_reader = csv.reader(check, dialect)

            master_indices = dict((r[x], i) for i, r in enumerate(master_reader)) 

            for row in check_reader:
                index = master_indices.get(row[x])

                if index is not None:
                    T += 1
                    matches += 1
                else:
                    T += 1

1 answer:

Answer 0 (score: 1)

You can use csv.Sniffer to detect the dialect of each CSV file:

with open(mfile, 'rb') as master:
    dialect = csv.Sniffer().sniff(master.read(1024))
    master.seek(0)
    master_reader = csv.reader(master, dialect)

    with open(cfile, 'rb') as check:
        dialect = csv.Sniffer().sniff(check.read(1024))
        check.seek(0)
        check_reader = csv.reader(check, dialect)
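
The snippet above is written for Python 2, where the files are opened with 'rb'. For reference, here is a minimal sketch of the same idea for Python 3, where the csv module expects text-mode files opened with newline=''. The helper names sniff_dialect and count_matches, the sample_size parameter, and the candidate delimiter list ",\t;" are assumptions made for illustration and are not part of the original script.

```python
import csv


def sniff_dialect(path, sample_size=1024, candidates=",\t;"):
    """Guess the CSV dialect of a file from its first sample_size characters.

    `candidates` (an assumption for this sketch) restricts the characters
    that csv.Sniffer may pick as the delimiter. csv.Sniffer.sniff raises
    csv.Error if it cannot determine a delimiter.
    """
    with open(path, newline='') as f:
        return csv.Sniffer().sniff(f.read(sample_size), delimiters=candidates)


def count_matches(mfile, cfile, column):
    """Return (matches, total) for one column of cfile checked against mfile."""
    # Collect every value that appears in the chosen column of the master file.
    with open(mfile, newline='') as master:
        reader = csv.reader(master, sniff_dialect(mfile))
        master_values = {row[column] for row in reader if len(row) > column}

    matches = total = 0
    with open(cfile, newline='') as check:
        for row in csv.reader(check, sniff_dialect(cfile)):
            if len(row) <= column:
                continue  # skip rows that are too short to have this column
            total += 1
            if row[column] in master_values:
                matches += 1
    return matches, total
```

Calling count_matches(mfile, cfile, x) once per column x and summing the results gives the same matches and T counters as the loop in the question. Each file is sniffed separately, so the master file and the check file may use different delimiters.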