I have a Python script that compares two files and writes the lines they have in common to a results file. I am using a Mac.
script.py
with open('temp1.csv', 'r') as file1:
    with open('serialnumbers.txt', 'r') as file2:
        same = set(file1).intersection(file2)
        print same
with open('results.csv', 'w') as file_out:
    for line in same:
        file_out.write(line)
        print line
temp1.csv
M11435TDS144
M11543TH4292
SN005
M11509TD9937
M11543TH4258
SN005
SN006
SN007
serialnumbers.txt
G1A114042400571
M11251TH1230
M11543TH4258
M11435TDS144
M11543TH4292
M11509TD9937
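On the Mac, the output of the script above is:
set([])
If I run the same script on Windows, it works fine. I suspect the problem is related to the CSV file on the Mac. How can I fix this?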
Answer 0 (score: 2)
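The two files use different line-ending delimiters.
So when the lines are read without newline translation, as in binary mode, the same serial number followed by a different terminator is a different string, and the lines never match.
You should read both files in text mode. io.open translates every line ending to '\n' by default, like this: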
import io
with io.open('temp1.csv', 'r') as file1:
    with io.open('serialnumbers.txt', 'r') as file2:
        same = set(file1).intersection(file2)
        print(same)
You will get:
set([u'M11543TH4258\n', u'M11509TD9937\n', u'M11543TH4292\n', u'M11435TDS144\n'])
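To illustrate why the delimiters matter, here is a minimal sketch. It assumes, purely for illustration, that one file ends its lines with '\r\n' and the other with '\n'; the actual endings of the files in the question are not known. The helper name line_set and the temporary files are hypothetical.

```python
import io
import os
import tempfile


def line_set(path, newline=None):
    # newline=None: universal newlines, every line ending is returned as '\n'.
    # newline='':   lines are still split, but endings are kept as stored.
    with io.open(path, 'r', newline=newline) as f:
        return set(f)


tmpdir = tempfile.mkdtemp()
csv_path = os.path.join(tmpdir, 'temp1.csv')
txt_path = os.path.join(tmpdir, 'serialnumbers.txt')

# Assumed endings for this demo: '\r\n' in one file, '\n' in the other.
with io.open(csv_path, 'wb') as f:
    f.write(b'M11435TDS144\r\nSN005\r\n')
with io.open(txt_path, 'wb') as f:
    f.write(b'M11435TDS144\nG1A114042400571\n')

# Untranslated endings: 'M11435TDS144\r\n' and 'M11435TDS144\n' differ.
raw_common = line_set(csv_path, newline='') & line_set(txt_path, newline='')

# Translated endings: both files yield 'M11435TDS144\n'.
text_common = line_set(csv_path) & line_set(txt_path)

print(raw_common)
print(text_common)
```

With untranslated endings the intersection is empty, which matches the symptom in the question; with translation the shared serial number is found.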
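Also note that CSV files are usually encoded in ISO-8859-1 or cp1252 (the legacy Windows encoding).
To compare the values without their trailing newlines or surrounding whitespace, strip each line before intersecting:
import io
with io.open('temp1.csv', 'r') as file1:
    with io.open('serialnumbers.txt', 'r') as file2:
        same = set(line.strip() for line in file1).intersection(line.strip() for line in file2)
        print(same)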
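Putting the pieces together with the writing step from the question, a possible sketch follows. The function names common_lines and write_lines are hypothetical, the cp1252 default is only an assumption based on the encoding remark above, and the demo writes its sample data to a temporary directory instead of touching real files.

```python
import io
import os
import tempfile


def common_lines(path_a, path_b, encoding='cp1252'):
    """Return the stripped, non-empty lines that appear in both files."""
    with io.open(path_a, 'r', encoding=encoding) as file_a:
        lines_a = set(line.strip() for line in file_a)
    with io.open(path_b, 'r', encoding=encoding) as file_b:
        lines_b = set(line.strip() for line in file_b)
    common = lines_a & lines_b
    common.discard(u'')  # ignore blank lines
    return common


def write_lines(path, lines, encoding='cp1252'):
    """Write one line per entry, sorted for a stable output order."""
    with io.open(path, 'w', encoding=encoding) as file_out:
        for line in sorted(lines):
            file_out.write(line + u'\n')


# Demo with the sample data from the question; '\r\n' endings are assumed
# for the CSV file to mimic a file created on another platform.
tmpdir = tempfile.mkdtemp()
csv_path = os.path.join(tmpdir, 'temp1.csv')
txt_path = os.path.join(tmpdir, 'serialnumbers.txt')
out_path = os.path.join(tmpdir, 'results.csv')

with io.open(csv_path, 'wb') as f:
    f.write(b'M11435TDS144\r\nM11543TH4292\r\nSN005\r\nM11509TD9937\r\n'
            b'M11543TH4258\r\nSN005\r\nSN006\r\nSN007\r\n')
with io.open(txt_path, 'wb') as f:
    f.write(b'G1A114042400571\nM11251TH1230\nM11543TH4258\n'
            b'M11435TDS144\nM11543TH4292\nM11509TD9937\n')

common = common_lines(csv_path, txt_path)
write_lines(out_path, common)

with io.open(out_path, 'r', encoding='cp1252') as f:
    written = [line.strip() for line in f]
print(written)
```

Stripping before comparing makes the result independent of line endings and trailing spaces, and writing the newline back explicitly keeps one serial number per line in the results file.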