Question

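I'm a beginner, and I'm trying to write a function that takes two XML files and finds the reciprocal best hits: if a protein in species A has its best match to a protein in species B, and vice versa, then the two are reciprocal best hits based on their total score. I'm hoping someone can help, because I don't know where to start.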

from Bio.Blast import NCBIXML

record1 = NCBIXML.parse(open(filename1))
record2 = NCBIXML.parse(open(filename2))

for record in record1:
    query_id1 = record.query_id

    for alignment in record.alignments:
        # Total score for this query/subject pair: sum of its HSP bit scores
        total_score1 = 0

        for hsp in alignment.hsps:
            total_score1 += hsp.bits

Answer 1

This is how I find orthologous genes:

BLAST A against B.

Parse the output and save the best hits in a dict:

# "A1_prot" comes from the query and "B1_prot" from the subject
matches = {"A1_prot": "B1_prot",
           "A2_prot": "B2_prot"}

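The parsing step above could be sketched roughly as follows. This is only an illustration, not the answerer's actual code: the function names `best_hits` and `best_hits_from_file` are hypothetical, the "best" hit is assumed to be the subject with the highest summed HSP bit score (as in the question's loop), and `alignment.hit_id` is assumed to be the identifier you want for the subject. Ties go to the first alignment encountered.

```python
def best_hits(records):
    """Map each query ID to the subject ID with the highest total bit score.

    `records` is any iterable of parsed BLAST records, e.g. the result of
    NCBIXML.parse(handle).
    """
    matches = {}
    for record in records:
        best_id, best_score = None, 0.0
        for alignment in record.alignments:
            # Total score for this subject: sum of the bit scores of its HSPs
            total = sum(hsp.bits for hsp in alignment.hsps)
            if total > best_score:
                best_id, best_score = alignment.hit_id, total
        if best_id is not None:  # queries without any hit are skipped
            matches[record.query_id] = best_id
    return matches


def best_hits_from_file(path):
    """Hypothetical convenience wrapper: parse a BLAST XML file with Biopython."""
    from Bio.Blast import NCBIXML  # imported here so best_hits() works without Biopython

    with open(path) as handle:
        # The parser is lazy, so it must be consumed while the file is open
        return best_hits(NCBIXML.parse(handle))
```

Calling `best_hits_from_file(filename1)` on the A-versus-B output would then give a dict shaped like `matches` above.
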
BLAST B against A.

Parse this output, looking up each result in the previous dict:

orthologous = []
# Now "A1_prot" comes from the subject and "B1_prot" is the query
# .get() avoids a KeyError when "A1_prot" had no hit in the first search
if matches.get("A1_prot") == "B1_prot":
    orthologous.append(("A1_prot", "B1_prot"))
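
Generalized to every pair, the reciprocal check might look like the sketch below. The function name `reciprocal_best_hits` and the example IDs in `b_to_a` are made up for illustration; each dict is assumed to map a query ID to its best subject ID, one per search direction.

```python
def reciprocal_best_hits(a_to_b, b_to_a):
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    return [(a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a]


# Best hits from BLASTing A against B
a_to_b = {"A1_prot": "B1_prot", "A2_prot": "B2_prot"}
# Best hits from BLASTing B against A (B2_prot prefers a different A protein)
b_to_a = {"B1_prot": "A1_prot", "B2_prot": "A3_prot"}

orthologous = reciprocal_best_hits(a_to_b, b_to_a)
print(orthologous)  # [('A1_prot', 'B1_prot')]
```

Using `dict.get` means a protein that never appears as a query in the second search is simply skipped rather than raising an error.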
