I am trying to compare two CSV files and expect the output shown below, but I can't get it to work. Here are my sample data and code:
File1.csv
meNOG00110,9606.ENSP00000349259,1,2364
meNOG06332,9606.ENSP00000344967,1,322
meNOG06773,9606.ENSP00000344961,1,379
meNOG03133,9606.ENSP00000387429,1,2089
meNOG17468,9606.ENSP00000217169,1,298
File2.csv
meNOG06332,9606.ENSP00000344967,1,322
meNOG00110,9606.ENSP00000349259,1,2364
meNOG00110,9606.ENSP00000357130,1,2419
meNOG00018,10090.ENSMUSP00000027367,1,261
meNOG00018,10090.ENSMUSP00000072852,1,276
output.txt
meNOG06332 9606.ENSP00000344967 1 322
meNOG00110 9606.ENSP00000349259 1 2364
meNOG00018 10090.ENSMUSP00000027367 1 261
meNOG00018 10090.ENSMUSP00000072852 1 276
Code:
import csv

file1 = open("File1.csv", "rU")
reader1 = csv.reader(file1, delimiter=',')
file2 = open("File2.csv", "rU")
reader2 = csv.reader(file2, delimiter=',')
for row2 in reader2:
    for row1 in reader1:
        if row2[1].startswith('9606'):
            if row2[1] == row1[1]:
                print row2
        else:
            print row2
But this code only searches the first row.
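Answer 0 (score: 0)
I'm not sure this is exactly what you want, since the question is not clear:
If you are looking for the overlap between the two files and want to compare whole lines, you can build two sets (one per file) and output their intersection: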
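with open('File1.csv', 'r') as infile1, \
     open('File2.csv', 'r') as infile2, \
     open('File3.csv', 'w') as outfile:
    # Strip newlines so a last line without one still matches.
    lines1 = set(line.rstrip('\n') for line in infile1)
    lines2 = set(line.rstrip('\n') for line in infile2)
    # The lines are already CSV text, so write them back unchanged.
    # Note that a set does not preserve the original line order.
    for line in lines1 & lines2:
        outfile.write(line + '\n')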
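If the goal is the output.txt shown in the question rather than a whole-line intersection, the following Python 3 sketch may be closer. It assumes the intended rule is: keep every File2 row whose second column does not start with '9606', and keep a '9606' row only when its second column also appears in File1. The function name `select_rows` is made up for illustration. It also avoids the problem in the question's code, where `reader1` is a one-pass iterator that is exhausted after the first `row2`, so later rows are never compared.

```python
import csv

def select_rows(lines1, lines2):
    """Return rows from lines2 that pass the assumed filter.

    lines1 and lines2 are iterables of CSV text lines (open files work).
    """
    # Collect the second column of File1 once, so lookups do not
    # depend on re-reading an already exhausted reader.
    ids1 = {row[1] for row in csv.reader(lines1) if row}
    kept = []
    for row in csv.reader(lines2):
        if not row:
            continue  # skip blank lines
        # Assumed rule: non-9606 rows always pass; 9606 rows must be in File1.
        if not row[1].startswith('9606') or row[1] in ids1:
            kept.append(row)
    return kept
```

To use it on the files, open both with `open('File1.csv', newline='')` and `open('File2.csv', newline='')`, pass the file objects to `select_rows`, and write `' '.join(row)` for each returned row to output.txt.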
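Answer 1 (score: 0)
I'm not sure what result format you expect, but to compare two files you can use the standard Python module difflib:
http://docs.python.org/2/library/difflib.html
You can then parse and format its output as needed.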
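As a rough illustration of the difflib approach, here is a Python 3 sketch using `difflib.unified_diff`. The helper name `diff_lines` and the default file labels are assumptions made for this example; the inputs are lists of lines with trailing newlines already stripped.

```python
import difflib

def diff_lines(lines_a, lines_b, name_a='File1.csv', name_b='File2.csv'):
    """Return a unified diff of two lists of lines as a list of strings."""
    # lineterm='' keeps difflib from appending newlines to its header lines,
    # which suits input whose newlines were already stripped.
    return list(difflib.unified_diff(
        lines_a, lines_b, fromfile=name_a, tofile=name_b, lineterm=''))
```

Lines prefixed with `-` occur only in the first input and lines prefixed with `+` only in the second. Note that a diff is position-sensitive, so rows that appear in both files but in a different order are still reported as changes.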
Answer 2 (score: 0)
You can zip the two files together:
with open(path_a, 'r') as a, open(path_b, 'r') as b:
    for line_a, line_b in zip(a, b):
        # Strip the trailing newlines so each pair prints on one line.
        print(line_a.rstrip('\n') + ' ' + line_b.rstrip('\n'))
If the first file is:
a
s
d
f
and the second file is:
q
w
e
r
the output will be:
a q
s w
d e
f r
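One caveat: `zip` stops at the end of the shorter file, so extra lines in the longer one are silently dropped. If that matters, a Python 3 sketch with `itertools.zip_longest` could look like this (the helper name `pair_lines` and the empty-string fill value are choices made for this example):

```python
from itertools import zip_longest

def pair_lines(lines_a, lines_b, fill=''):
    """Pair lines from two iterables, padding the shorter one with `fill`."""
    return [(a.rstrip('\n'), b.rstrip('\n'))
            for a, b in zip_longest(lines_a, lines_b, fillvalue=fill)]
```

Passing two open file objects works the same way as passing lists of lines.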