Find matching columns between two CSV files

Time: 2015-04-26 19:53:18

Tags: python string

I have two CSV files with the following header strings:

  • File 1:

    Product invoice,product verification,product not completed,product completed
    
  • File 2:

    Product invoice,product completed
    

I need to find the common columns:

    Product invoice,product completed

Note that the columns should appear in the same order in file1 and file2.

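2 answers:

Answer 0: (score: 1)

Build a set of strings from row[0] and row[2] of every row in file1, formatted exactly as the rows appear in file2, then iterate over file2 and check whether each line appears in the set: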

import csv

# file1 and file2 are assumed to hold the paths of the two CSV files
with open(file1) as f1, open(file2) as f2:
    # skip headers
    next(f1), next(f2)
    r1 = csv.reader(f1)
    # make set of strings matching format of file2
    st = set("{},{}".format(row[0], row[2]) for row in r1)
    # iterate over every line in file2
    # and check if the line appears in the set
    for line in f2:
        # strip the trailing newline once, so print() does not emit a blank line
        line = line.rstrip()
        if line in st:
            print(line)

File 1:

Product invoice,product verification,product completed
foo,1,2
bar,3,4
foo,foo,bar

File 2:

Product invoice,product completed
foo,2
bar,4
foobar,foo
bar,bar

Output:

foo,2
bar,4

If you want the data as lists:

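import csv

# file1 and file2 are assumed to hold the paths of the two CSV files
with open(file1, newline="") as f1, open(file2, newline="") as f2:
    r1 = csv.reader(f1)
    r2 = csv.reader(f2)
    # headers are not skipped here, so the header row matches as well
    st = set((row[0], row[2]) for row in r1)
    for row in r2:
        # rows are lists; convert to a tuple to test set membership
        if tuple(row) in st:
            print(row)

Output:

['Product invoice', 'product completed']
['foo', '2']
['bar', '4']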
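If only the header names are needed, as in the question, reading the first row of each file is enough. The following is a rough sketch rather than part of the original answer: the function name common_columns is hypothetical, and io.StringIO stands in for the real files, holding just the two header lines from the question.

```python
import csv
import io

def common_columns(f1, f2):
    """Return header names present in both files, in the first file's order."""
    header1 = next(csv.reader(f1))       # first row of file 1
    header2 = set(next(csv.reader(f2)))  # first row of file 2, for fast lookup
    return [name for name in header1 if name in header2]

# Header lines taken from the question; StringIO replaces real files here.
file1 = io.StringIO(
    "Product invoice,product verification,"
    "product not completed,product completed\n"
)
file2 = io.StringIO("Product invoice,product completed\n")

common = common_columns(file1, file2)
print(common)
```

Iterating over the first header and filtering against a set keeps the column order of file 1, which a plain set intersection would not guarantee.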

Answer 1: (score: 0)

You have

list1 = ["Product invoice", "product verification", "product completed"]
list2 = ["Product invoice", "product completed"]

and you want to check whether the elements of list2 are in list1, right? So

def compare(list1, list2):
    list3 = []
    for elem in list1:
        if elem in list2:
            list3.append(elem)
    return list3==list2

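The algorithm builds list3 from the elements of list1 that also appear in list2, then checks whether the resulting list equals list2.

It returns True only if the elements appear in the same order in both lists. Also, keep in mind that "Product completed" is not the same as "product completed", so you may want to lowercase all the strings (with str.lower()).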
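The lowercasing suggestion could be sketched as follows. The function name common_columns_ci is hypothetical, and stripping surrounding whitespace is an extra assumption that the answer does not mention; the sample lists differ in case on purpose.

```python
def common_columns_ci(list1, list2):
    """Return elements of list1 that also occur in list2, ignoring case."""
    # Normalize list2 once: strip() is an added assumption, lower() is the
    # case folding suggested in the answer.
    normalized = {name.strip().lower() for name in list2}
    return [name for name in list1 if name.strip().lower() in normalized]

list1 = ["Product invoice", "product verification", "Product completed"]
list2 = ["product invoice", "product completed"]

common = common_columns_ci(list1, list2)
print(common)
```

The result keeps the original spelling and order from list1, so it can be used directly to select columns from the first file.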