I have 2 CSV files, each containing 3 columns: "num", "date", and "tex".
File1
num date tex
20170512 12/05/2017 15:39 1001
20170512 12/05/2017 15:39 1001
20170908 08/09/2017 02:42 1001
20170908 08/09/2017 06:30 1001
File2
num date tex
201705332212 12/05/2017 15:39 1001
20170523212 12/05/2017 15:39 100156
2017232320908 08/09/2017 02:42 10012
20170908 08/09/2017 06:30 1001
Desired output
diff.csv
num date tex
201705332212 12/05/2017 15:39 1001
20170523212 12/05/2017 15:39 100156
2017232320908 08/09/2017 02:42 10012
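I want to match on both the "num" and "tex" columns. Currently, my code only checks for differences across the whole file rather than on the "num" and "tex" columns specifically. Ideally, when the "num" and "tex" columns of a row do not match, I want that row written to the out.csv file.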
Answer 0 (score: 1)
Use the csv module.
For example:
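import csv

# newline="" lets the csv module handle line endings itself (Python 3);
# the old "rU" mode is deprecated and was removed in Python 3.11.
with open("file1.csv", newline="") as file_0, \
        open("file2.csv", newline="") as file_1, \
        open("out.csv", "w", newline="") as out_file:
    reader_0 = csv.reader(file_0, delimiter=";")
    reader_1 = csv.reader(file_1, delimiter=";")
    next(reader_0)  # Skip header of file1
    out_file_writer = csv.writer(out_file, delimiter=";")
    out_file_writer.writerow(next(reader_1))  # Write header from file2
    # Rows are compared pairwise by position in the two files
    for k, v in zip(reader_0, reader_1):
        # k[0]/v[0] is "num", k[-1]/v[-1] is "tex"
        if (k[0] != v[0]) or (k[-1] != v[-1]):
            out_file_writer.writerow(v)  # Write the differing row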
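
Note that zip compares the files row by row, so it only works if both files have the same row order and length. If that cannot be guaranteed, one possible variant is to collect the ("num", "tex") pairs from the first file into a set and keep every row of the second file whose pair is not in it. This is only a sketch: the function name rows_not_in_first is made up for illustration, and the ";" delimiter is carried over from the code above as an assumption about the real files.

```python
import csv


def rows_not_in_first(path_1, path_2, out_path, delimiter=";"):
    """Write rows of path_2 whose (num, tex) pair never occurs in path_1.

    Hypothetical helper: assumes "num" is the first column and "tex"
    the last, and that both files start with a header row.
    Returns the number of data rows written.
    """
    with open(path_1, newline="") as f1:
        reader = csv.reader(f1, delimiter=delimiter)
        next(reader, None)  # skip header
        seen = {(row[0], row[-1]) for row in reader if row}

    written = 0
    with open(path_2, newline="") as f2, \
            open(out_path, "w", newline="") as out:
        reader = csv.reader(f2, delimiter=delimiter)
        writer = csv.writer(out, delimiter=delimiter)
        header = next(reader, None)
        if header is not None:
            writer.writerow(header)  # copy header from the second file
        for row in reader:
            # keep the row only if its (num, tex) pair is new
            if row and (row[0], row[-1]) not in seen:
                writer.writerow(row)
                written += 1
    return written
```

Calling rows_not_in_first("file1.csv", "file2.csv", "out.csv") would then write the header plus the unmatched rows of the second file, regardless of row order.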