Processing two text files line by line with the csv module

Posted: 2013-05-07 19:24:41

Tags: python loops csv

I have two text files. One has about 100 lines (A) and the other has roughly 800 lines (B).

I want to read one line from A, then read all the lines from B, and print a line that contains values from each file.

I am using Python's csv module because I know the format and content of these files: both contain comma-separated values.

My code looks something like this:

import csv

infile1 = r'C:\zData\a.txt'
infile2 = r'C:\zData\b.txt'

csvfile1  = open(infile1, 'r')
myreader1 = csv.DictReader(csvfile1)

csvfile2  = open(infile2, 'r')
myreader2 = csv.DictReader(csvfile2)

for row1 in myreader1:

    for row2 in myreader2:

        print "GID = " + row1['GID'] + ", ABC = " + row2['ABC']

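I suspect this is a simple problem, but for some reason this code only reads the first line of the outer loop (infile1) and all the lines of the inner loop (infile2).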

What am I doing wrong? I tried adding myreader1.next, which did not seem to make any difference.

Thanks.

2 answers:

Answer 0: (score: 2)

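You can loop over a csv.reader or csv.DictReader object only once; after that, the file pointer is at the end of the file. In your code, myreader2 is exhausted during the first pass of the outer loop, so the inner loop has nothing left to read for every later row of infile1.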
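To see that behaviour in isolation, here is a minimal sketch. It is written for Python 3 (print as a function) and uses a hypothetical in-memory sample via io.StringIO instead of the files from the question.

```python
import csv
import io

# Hypothetical in-memory stand-in for a small CSV file (not the asker's data).
sample = io.StringIO("GID,NAME\n1,first\n2,second\n")

reader = csv.DictReader(sample)

first_pass = [row["GID"] for row in reader]   # consumes every data row
second_pass = [row["GID"] for row in reader]  # the reader is already exhausted

print(first_pass)
print(second_pass)
```

The second list comes back empty, which is the same thing that happens to the inner loop in the question.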

Instead, read all the rows of the first (smaller) file into a list that is kept in memory:

with open(infile1, 'r') as csvfile1:
    rows1 = list(csv.DictReader(csvfile1))

Now you can loop over that list as many times as you need:

with open(infile2, 'r') as csvfile2:
    myreader2 = csv.DictReader(csvfile2)

    for row2 in myreader2:
        for row1 in rows1:
            print "GID = " + row1['GID'] + ", ABC = " + row2['ABC']
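As a self-contained illustration of the list approach, the sketch below wraps it in a function. It targets Python 3, and both the cross_lines helper and the in-memory sample data are made up for the example; only the GID and ABC column names come from the question.

```python
import csv
import io


def cross_lines(file_a, file_b):
    """Pair every row of file_a with every row of file_b (hypothetical helper)."""
    rows_a = list(csv.DictReader(file_a))  # smaller file, held in memory
    lines = []
    for row_b in csv.DictReader(file_b):   # larger file, streamed exactly once
        for row_a in rows_a:               # a list can be iterated repeatedly
            lines.append("GID = " + row_a["GID"] + ", ABC = " + row_b["ABC"])
    return lines


# Hypothetical sample data standing in for a.txt and b.txt.
a = io.StringIO("GID\n1\n2\n")
b = io.StringIO("ABC\nx\ny\nz\n")

result = cross_lines(a, b)
for line in result:
    print(line)
```

Note that the outer loop runs over the second file here, so the output is grouped by ABC value. If the grouping should follow file A, load file B into the list and stream A instead.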

Another option is to reopen the second file and create a new myreader2 inside the loop each time:

with open(infile1, 'r') as csvfile1:
    myreader1 = csv.DictReader(csvfile1)
    for row1 in myreader1:
        with open(infile2, 'r') as csvfile2:
            myreader2 = csv.DictReader(csvfile2)

            for row2 in myreader2:
                print "GID = " + row1['GID'] + ", ABC = " + row2['ABC']
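A related variation, which is a substitution on my part rather than the reopening shown here, is to keep the second file open, rewind it with seek(0), and build a fresh DictReader for each outer row. The sketch assumes Python 3, and the helper name and sample data are hypothetical.

```python
import csv
import io


def cross_lines_rewind(file_a, file_b):
    """Rewind file_b for each row of file_a instead of reopening it."""
    lines = []
    for row_a in csv.DictReader(file_a):
        file_b.seek(0)                     # back to the start of file B
        reader_b = csv.DictReader(file_b)  # new reader, so the header is parsed again
        for row_b in reader_b:
            lines.append("GID = " + row_a["GID"] + ", ABC = " + row_b["ABC"])
    return lines


# Hypothetical sample data standing in for a.txt and b.txt.
a = io.StringIO("GID\n1\n2\n")
b = io.StringIO("ABC\nx\ny\n")

result = cross_lines_rewind(a, b)
for line in result:
    print(line)
```

Creating a new DictReader after each seek matters: reusing the old reader would treat the header line as a data row. Either way the second file is read once per row of the first, so the in-memory list is usually the cheaper choice for files this small.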

However, if you need to show matches between the two files, read the first file into a dictionary instead:

with open(infile1, 'r') as csvfile1:
    rows1 = {row['GID']: row for row in csv.DictReader(csvfile1)}

Now rows1 is a dictionary that maps each GID key to the row containing that value. This assumes that every row has a unique GID value.

That makes it easy to match rows against the information in the second CSV file:

with open(infile2, 'r') as csvfile2:
    myreader2 = csv.DictReader(csvfile2)

    for row in myreader2:
        if row['GID'] in rows1:
            print 'Matching GID {}!'.format(row['GID'])
            print 'infile1: {}'.format(rows1[row['GID']])
            print 'infile2: {}'.format(row)
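If the GID values in the first file are not unique, the dictionary comprehension keeps only the last row for each GID. One possible workaround, sketched here for Python 3 with hypothetical sample data and a made-up NAME column, is to group the rows per GID with collections.defaultdict.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; GID 1 appears twice in the first file.
first = io.StringIO("GID,NAME\n1,alpha\n1,beta\n2,gamma\n")
second = io.StringIO("GID,ABC\n1,x\n3,y\n")

# Map each GID to a list of all rows that carry it.
rows_by_gid = defaultdict(list)
for row in csv.DictReader(first):
    rows_by_gid[row["GID"]].append(row)

matches = []
for row in csv.DictReader(second):
    # .get() avoids creating empty entries for GIDs found only in the second file.
    for match in rows_by_gid.get(row["GID"], []):
        matches.append((match["NAME"], row["ABC"]))

print(matches)
```

Each lookup is still a single dictionary access, so the second file is read once no matter how many rows share a GID.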

Answer 1: (score: 0)

How about this:

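rows1 = list(myreader1)  # a reader can only be iterated once, so keep file A in a list
# DictReader keys are case-sensitive: use 'GID' and 'ABC' as in the question
f = ["gid={}, abc={}".format(x['GID'], y['ABC']) for y in myreader2 for x in rows1]
print f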
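A comparable one-liner can be built on itertools.product, which reads both inputs fully into memory before pairing them, so it works on readers directly. This is a sketch for Python 3 with hypothetical sample data, not something taken from either answer.

```python
import csv
import io
import itertools

# Hypothetical sample data standing in for a.txt and b.txt.
a = io.StringIO("GID\n1\n2\n")
b = io.StringIO("ABC\nx\ny\n")

# product() consumes both readers completely, then yields every (row_a, row_b) pair.
pairs = itertools.product(csv.DictReader(a), csv.DictReader(b))
lines = ["GID = {}, ABC = {}".format(row_a["GID"], row_b["ABC"])
         for row_a, row_b in pairs]

for line in lines:
    print(line)
```

Because both files end up in memory, this only suits inputs as small as the ones described in the question.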