Biopython: recording 2 extra nucleotides on either side of a gene, for the +2 and -2 reading frames

Time: 2017-02-11 17:02:25

Tags: python-3.x bioinformatics biopython

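I am looking for ambush (hidden) stop codons. My code has reached the point where I can extract the sequences I need from an EMBL file. However, I am a bit stuck on how to add two upstream and two downstream nucleotides, so that I end up with the -2, -1, 0, +1 and +2 reading frames.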

from Bio import SeqIO

for rec in SeqIO.parse("CP002701.embl", "embl"):
    if rec.features:
        for feature in rec.features:
            if feature.type == "CDS":
                print(feature.location)
                # .get() avoids a KeyError on CDS features without a protein_id
                print(feature.qualifiers.get("protein_id"))
                print(feature.location.extract(rec).seq)

This is the part I want to change, but I'm not sure how to modify .location so that it also selects the 4 extra bases I'm interested in.

1 answer:

Answer 0: (score: 0)

@user7550844 (OP) wrote on 2017-02-12 at 15:46

With some help from mofoniusrex on reddit, here is a working solution:

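from Bio import SeqIO, SeqFeature

pad = 2  # extra nucleotides to take on each side of the CDS
for rec in SeqIO.parse("CP002701.embl", "embl"):
    if rec.features:
        for feature in rec.features:
            if feature.type == "CDS":
                # Widen the location by `pad`, clamped to the record boundaries
                # (a clamped feature ends up with fewer than `pad` flanking bases).
                start = max(0, int(feature.location.start) - pad)
                end = min(len(rec), int(feature.location.end) + pad)
                # strand, ref and ref_db are carried on the location, so that
                # extract() still reverse-complements minus-strand genes.
                newloc = SeqFeature.FeatureLocation(
                    start, end,
                    strand=feature.location.strand,
                    ref=feature.location.ref,
                    ref_db=feature.location.ref_db)
                newfeature = SeqFeature.SeqFeature(location=newloc, type=feature.type)
                newfeature.qualifiers = feature.qualifiers
                print(newfeature.qualifiers)
                print(newfeature.location)
                print(newfeature.location.extract(rec).seq)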
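
Once the padded sequence is extracted, the five reading frames from the question can be scanned for stop codons. The following is only a minimal sketch, not part of the original answer: the function name stops_by_frame, the STOP_CODONS set and the toy sequence are all made up for illustration. It assumes the standard genetic code and a padded sequence that is already on the coding strand (which is what extract() returns when the location carries the strand).

```python
# Stop codons of the standard genetic code, in DNA letters (an assumption;
# other translation tables use different stops).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stops_by_frame(padded, pad=2):
    """Map each frame offset in -pad..+pad to the stop codons found in it.

    `padded` is the CDS plus `pad` flanking bases on each side.
    Reported positions are 0-based and relative to the first base of
    the CDS, so a negative position lies in the upstream padding.
    """
    padded = padded.upper()
    result = {}
    for offset in range(-pad, pad + 1):
        begin = pad + offset  # index in `padded` where this frame starts
        hits = []
        # Step through complete codons only.
        for i in range(begin, len(padded) - 2, 3):
            if padded[i:i + 3] in STOP_CODONS:
                hits.append(i - pad)
        result[offset] = hits
    return result


# Toy input: CDS "ATGATAGCTTAA" with hypothetical flanks "TA" and "GG".
toy = "TA" + "ATGATAGCTTAA" + "GG"
for offset, hits in stops_by_frame(toy).items():
    print(offset, hits)
```

Note that offsets which differ by 3 share the same phase: here the -2 frame finds the same stops as the +1 frame, plus one extra codon that overlaps the upstream padding. The padding therefore mainly adds the codons that straddle the gene boundaries.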