Question

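What I want to do is lowercase all the non-hypothetical sequences of a GenBank record in the genome file.

So far I've managed to get the start and end positions of the proteins in the gbk. From there I did the following: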

start = feature.location.nofuzzy_start
end = feature.location.nofuzzy_end
gb_record.seq[start:end]

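Now I have the start and end positions of the sequence in the genome. But how do I modify the genome file? gb_record.seq[start:end].lower() or something similar doesn't do it.

When I assign gb_record.seq = gb_record.seq[start:end].lower, I obviously get an error when I replace the genome file. Any ideas?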

Answer 1

Bio.Seq.Seq objects have a lower() method that does what you want.

Fixing your code, you get:

seq_lower = gb_record.seq.lower()

Then you should be able to write the lowercase sequence out to a file using the SeqIO module.

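from Bio import SeqIO

# Put the lowercased sequence back on the record, then collect the records to write.
gb_record.seq = seq_lower
lower_records = [gb_record]

with open("example.fasta", 'w') as handle:
    SeqIO.write(lower_records, handle, 'fasta')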
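
Note that gb_record.seq.lower() lowercases the entire sequence. If only the selected regions should be lowercased, the sequence has to be rebuilt, because sequences (like Python strings) cannot be modified in place. Below is a minimal sketch of that idea. The helper name lowercase_regions and the toy genome are made up for illustration, and the example runs on a plain str; it relies only on slicing, lower() and +, which Bio.Seq.Seq should also support.

```python
def lowercase_regions(seq, regions):
    """Return a copy of seq with each (start, end) region lowercased.

    Coordinates are 0-based and end-exclusive, like Python slices.
    Hypothetical helper for illustration, not part of Biopython.
    """
    for start, end in regions:
        # Immutable sequence: rebuild as prefix + lowercased slice + suffix.
        seq = seq[:start] + seq[start:end].lower() + seq[end:]
    return seq


# Toy data standing in for gb_record.seq and the feature coordinates.
genome = "ATGCCCGGGTTTAAACCC"
print(lowercase_regions(genome, [(0, 6), (9, 12)]))  # atgcccGGGtttAAACCC
```

With a real record, the regions would presumably come from the features you want to keep, for example (int(feature.location.start), int(feature.location.end)) for every CDS whose product qualifier does not mention "hypothetical", and the result would be assigned back to gb_record.seq before writing. This sketch ignores strand and treats a joined (multi-part) location as one span from its first to its last base.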