Comparing different CSV files in Python

Time: 2013-07-29 19:10:26

Tags: python csv bioinformatics

Suppose I have two CSV files:

File 1:

Epitope Name,Epitope,Protein,position,position

3606,NSRSTSLSV,FOO,10,21

File 2:

A,B,C,D,E,F,G,H,I,J,K

0,1,2,3,4,5,6,7,8,9,NSRSTSLSV

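Basically, I want to check whether the contents of row 1 in file 1 are found in row 10 of file 2. If the contents match, I will write a third CSV, which is a new version of file 1 with an extra column saying found or not found.

Right now nothing is being found, and I know that is not correct. In some cases, the text from file 1 may appear inside a larger block of text in file 2.

Here is what I have so far (adapted from an earlier answer):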

#usr/bin/python2.4

import csv

f1 = file ('all_epitopes.csv', 'rb')
f2 = file ('positiveBcell.csv', 'rb')
f3 = file ('results.csv', 'w')

c1 = csv.reader((f1), delimiter=",", quotechar='"')
c2 = csv.reader((f2), delimiter=",", quotechar='"')
c3 = csv.writer((f3), delimiter=",", quotechar='"')


positiveBcell = [row for row in c2]

for all_epitopes_row in c1:
    row = 1
    found = False
    for master_row in positiveBcell:
        results_row = all_epitopes_row
        if all_epitopes_row[2] == positiveBcell[10]:
            results_row.append('FOUND in Bcell List (row ' + str(row) + ')')
            found = True
            break
        row = row +1
    if not found:
        results_row.append('NOT FOUND in Bcell list')
    c3.writerow(results_row)

f1.close()
f2.close()
f3.close()

1 Answer:

Answer 0 (score: 0)

Assuming your two files are:

File 1:

Epitope Name,Epitope,Protein,position,position

#Row 1#
3606,NSRSTSLSV,FOO,10,21

File 2:

A,B,C,D,E,F,G,H,I,J,K

#Row 10#
0,1,2,3,4,5,6,7,8,9,NSRSTSLSV
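
To make the indexing concrete, here is a small sketch that parses the two sample files with `csv.reader`. It assumes Python 3, and the in-memory strings are stand-ins for the real files. With this sample data, the epitope sits at index 1 of the file 1 data row and at index 10 of the file 2 data row, while index 2 of file 1 holds the protein name.

```python
import csv
import io

# In-memory stand-ins for the two sample files (header plus one data row each).
file1_text = (
    "Epitope Name,Epitope,Protein,position,position\n"
    "3606,NSRSTSLSV,FOO,10,21\n"
)
file2_text = (
    "A,B,C,D,E,F,G,H,I,J,K\n"
    "0,1,2,3,4,5,6,7,8,9,NSRSTSLSV\n"
)

file1_rows = list(csv.reader(io.StringIO(file1_text)))
file2_rows = list(csv.reader(io.StringIO(file2_text)))

epitope = file1_rows[1][1]    # data row of file 1, field index 1
protein = file1_rows[1][2]    # data row of file 1, field index 2
target = file2_rows[1][10]    # data row of file 2, field index 10

print(epitope, protein, target)
print(epitope == target)
```

Note that `file2_rows[10]` would index the list of rows, not a field within a row, so the field lookup has to happen on the individual row.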

After the OP's comment:

for all_epitopes_row in c1:
    row = 1
    found = False
    for master_row in positiveBcell:
        results_row = all_epitopes_row
        # Changed line: compare against the current row, not positiveBcell[10]
        if all_epitopes_row[2] == master_row[10]:
            results_row.append('FOUND in Bcell List (row ' + str(row) + ')')
            found = True
            break
        row = row +1
    if not found:
        results_row.append('NOT FOUND in Bcell list')
    c3.writerow(results_row)
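
The question also mentions that the text from file 1 may appear inside a larger block of text in file 2, which `==` will not catch. The following is a minimal sketch of a substring-based variant, not a definitive implementation. It assumes Python 3, and the function name `annotate_epitopes` and the parameters `epitope_col` and `text_col` are hypothetical. The defaults of 1 and 10 are taken from the sample data shown above (the loop above uses index 2), so adjust them to match the real files.

```python
import csv


def annotate_epitopes(epitopes_path, bcell_path, results_path,
                      epitope_col=1, text_col=10):
    """Copy the epitopes file, appending a FOUND / NOT FOUND column.

    epitope_col and text_col are assumed field indexes (0-based).
    """
    # Load the reference file once so it can be scanned for every epitope.
    with open(bcell_path, newline="") as f2:
        bcell_rows = list(csv.reader(f2))

    with open(epitopes_path, newline="") as f1, \
            open(results_path, "w", newline="") as f3:
        writer = csv.writer(f3)
        for epitope_row in csv.reader(f1):
            status = "NOT FOUND in Bcell list"
            epitope = epitope_row[epitope_col] if len(epitope_row) > epitope_col else ""
            if epitope:  # an empty string would match every text
                for row_number, bcell_row in enumerate(bcell_rows, start=1):
                    # Substring test: the epitope may sit inside a longer text.
                    if len(bcell_row) > text_col and epitope in bcell_row[text_col]:
                        status = "FOUND in Bcell List (row %d)" % row_number
                        break
            # Build a new list instead of mutating the row read from file 1.
            writer.writerow(epitope_row + [status])
```

It could be called as `annotate_epitopes('all_epitopes.csv', 'positiveBcell.csv', 'results.csv')`. As in the loop above, the header row of file 1 is treated like any other row and receives a status column.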