Question

I have two CSV files with the following header strings:

File 1:

    Product invoice,product verification,product not completed,product completed

File 2:
```
Product invoice,product completed
```

I need to find the common columns:

    Product invoice,product completed

Note that the columns should appear in the same order in file1 and file2.

Answer 1

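Build a set of strings from row[0] and row[2] of each row in file1, formatted exactly as the lines appear in file2, then iterate over file2 and check whether each line appears in the set: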

    import csv

    # file1 and file2 are the paths to the two CSV files
    with open(file1, newline="") as f1, open(file2, newline="") as f2:
        # skip the header line of each file
        next(f1)
        next(f2)
        r1 = csv.reader(f1)
        # build a set of strings matching the line format of file2
        st = {"{},{}".format(row[0], row[2]) for row in r1}
        # iterate over every line in file2
        # and check whether the line appears in the set
        for line in f2:
            line = line.rstrip()
            if line in st:
                print(line)

File 1:

    Product invoice,product verification,product completed
    foo,1,2
    bar,3,4
    foo,foo,bar

File 2:

    Product invoice,product completed
    foo,2
    bar,4
    foobar,foo
    bar,bar

Output:

    foo,2
    bar,4

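If you want the data as lists:

    import csv

    with open(file1, newline="") as f1, open(file2, newline="") as f2:
        r1 = csv.reader(f1)
        r2 = csv.reader(f2)
        # store (first column, third column) tuples from file1
        st = {(row[0], row[2]) for row in r1}
        for row in r2:
            # rows are lists, so convert to a tuple for the set lookup
            if tuple(row) in st:
                print(row)

Output (the header row also matches, because the headers are not skipped here):

    ['Product invoice', 'product completed']
    ['foo', '2']
    ['bar', '4']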
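
The indices 0 and 2 above are hard-coded for the sample files. A minimal sketch of deriving them from the headers instead, assuming every column of file2 also appears in file1 under exactly the same name. The function name `matching_rows` and the in-memory `io.StringIO` sample data are illustrative and not part of the original answer:

```python
import csv
import io


def matching_rows(f1, f2):
    """Return the rows of f2 that also occur in f1, restricted to f2's columns."""
    r1 = csv.reader(f1)
    r2 = csv.reader(f2)
    header1 = next(r1)
    header2 = next(r2)
    # position in file1 of each file2 column (raises ValueError if one is missing)
    idx = [header1.index(name) for name in header2]
    # project every file1 row onto the shared columns
    seen = {tuple(row[i] for i in idx) for row in r1}
    return [row for row in r2 if tuple(row) in seen]


# sample data from the answer above, held in memory instead of on disk
file1_text = """Product invoice,product verification,product completed
foo,1,2
bar,3,4
foo,foo,bar
"""
file2_text = """Product invoice,product completed
foo,2
bar,4
foobar,foo
bar,bar
"""

rows = matching_rows(io.StringIO(file1_text), io.StringIO(file2_text))
print(rows)  # [['foo', '2'], ['bar', '4']]
```

With real files, the two `io.StringIO` objects would be replaced by file objects opened with `newline=""`.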

Answer 2

You have

    list1 = ["Product invoice", "product verification", "product completed"]
    list2 = ["Product invoice", "product completed"]

and you want to check whether the elements of list2 are in list1, right? So:

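    def compare(list1, list2):
        # collect the elements of list1 that also occur in list2,
        # keeping the order they have in list1
        list3 = []
        for elem in list1:
            if elem in list2:
                list3.append(elem)
        # True only if the shared elements are in the same order as list2
        return list3 == list2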

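The algorithm builds list3 from the elements of list1 that are also in list2, then checks whether the resulting list equals list2.

It returns True only if the elements are in the same order. Also, keep in mind that "product completed" is not the same as "Product completed", so you may want to lowercase all the strings (with str.lower()).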
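
To get the common columns themselves rather than a True/False answer, a case-insensitive variant of the comparison could look like the sketch below. The function name `common_columns` is hypothetical, and lowercasing with `str.lower()` is assumed to be enough normalization for these headers:

```python
def common_columns(header1, header2):
    """Return the columns of header1 that also occur in header2, in header1's order."""
    # compare lowercased names so "Product completed" matches "product completed"
    wanted = {name.lower() for name in header2}
    return [name for name in header1 if name.lower() in wanted]


header1 = "Product invoice,product verification,product not completed,product completed".split(",")
# note the different capitalization of the second column
header2 = "Product invoice,Product completed".split(",")

common = common_columns(header1, header2)
print(common)  # ['Product invoice', 'product completed']

# same idea as compare(): do the shared columns appear in the same order in both files?
same_order = [c.lower() for c in common] == [c.lower() for c in header2]
print(same_order)  # True
```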

Find matching columns between two CSV files
