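How to return a list of letters?
I have a sequence translator and Python code that reads a DNA sequence and a protein sequence. The code translates the DNA sequence into a protein sequence, compares the result with the protein sequence that was read, and should print the list of letters from the translated sequence that also occur in the protein sequence that was read. How can I print the letters that are present in both proteins?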
def translate_codon(cod):
    """Translates a codon into an aminoacid using an internal dictionary with the standard genetic code."""
    tc = {"GCT":"A", "GCC":"A", "GCA":"A", "GCG":"A",
          "TGT":"C", "TGC":"C",
          "GAT":"D", "GAC":"D",
          "GAA":"E", "GAG":"E",
          "TTT":"F", "TTC":"F",
          "GGT":"G", "GGC":"G", "GGA":"G", "GGG":"G",
          "CAT":"H", "CAC":"H",
          "ATA":"I", "ATT":"I", "ATC":"I",
          "AAA":"K", "AAG":"K",
          "TTA":"L", "TTG":"L", "CTT":"L", "CTC":"L", "CTA":"L", "CTG":"L",
          "ATG":"M", "AAT":"N", "AAC":"N",
          "CCT":"P", "CCC":"P", "CCA":"P", "CCG":"P",
          "CAA":"Q", "CAG":"Q",
          "CGT":"R", "CGC":"R", "CGA":"R", "CGG":"R", "AGA":"R", "AGG":"R",
          "TCT":"S", "TCC":"S", "TCA":"S", "TCG":"S", "AGT":"S", "AGC":"S",
          "ACT":"T", "ACC":"T", "ACA":"T", "ACG":"T",
          "GTT":"V", "GTC":"V", "GTA":"V", "GTG":"V",
          "TGG":"W",
          "TAT":"Y", "TAC":"Y",
          "TAA":"_", "TAG":"_", "TGA":"_"}
    if cod in tc:
        return tc[cod]
    else:
        return '-1'

def seq_prot(dna_seq, ab):
    seqm = dna_seq.upper()
    prot = ab.upper()
    seq_aa = ''
    for pos in range(0, len(seqm)-2,3):
        cod = seqm[pos:pos+3]
        seq_aa += translate_codon(cod)
    for p in seq_aa:
        if p in prot:
            seq_aa[p] += 1
        else:
            seq_aa = p
    return seq_aa

dna_seq = "ACCCCTGTGACATACCTTTATGTTGCCTCGGCGGATCAGCCCGCGCCCC"
ab = 'TLYPAP'
print("The protein sequence are :",seq_prot(dna_seq, ab))
The protein sequence are : TYPP
Answer 0 (score: 0)
Your code is broken because it treats seq_aa as both a str and a dict. Let's add an actual dictionary to collect the results:
def seq_prot(dna_seq, ab):
    sequence = dna_seq.upper()
    protein = ab.upper()
    matches = {}
    for position in range(0, len(sequence), 3):
        codon = sequence[position: position + 3]
        aa = translate_codon(codon)
        if aa in protein:
            if aa in matches:
                matches[aa] += 1
            else:
                matches[aa] = 1
    return matches

dna_seq = "ACCCCTGTGACATACCTTTATGTTGCCTCGGCGGATCAGCCCGCGCCCC"
ab = 'TLYPAP'
print("The protein sequence matches are :", seq_prot(dna_seq, ab))
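Output:
The protein sequence matches are : {'T': 2, 'P': 3, 'Y': 2, 'L': 1, 'A': 3}
You can use .keys() on the returned dict to extract the proteins from it. If you want each letter multiplied by its value, you can use the multiplication sign (*) as a repetition operator. However, any sense of order has been lost; we are only dealing with existence. If you want to preserve order, we have to take a different approach.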
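As a rough sketch of those three options: the snippet below starts from the counts printed in the output, pulls out the letters with .keys(), repeats each letter with *, and then shows one possible way to keep the original order. The helper matches_in_order and the hard-coded translated string are assumptions made for illustration (the string is the question's DNA translated codon by codon, with the trailing incomplete codon left out); neither is part of the code above.

```python
# Counts copied from the output printed by seq_prot().
matches = {'T': 2, 'P': 3, 'Y': 2, 'L': 1, 'A': 3}

# Just the matching letters (dicts keep insertion order in Python 3.7+).
letters = list(matches.keys())

# Each letter repeated by its count, using * as the repetition operator.
repeated = "".join(aa * count for aa, count in matches.items())


def matches_in_order(seq_aa, protein):
    """Hypothetical helper: keep matching letters in translation order."""
    protein = protein.upper()
    return [aa for aa in seq_aa.upper() if aa in protein]


# Assumed input: the question's DNA translated by hand, without the
# trailing incomplete codon.
translated = "TPVTYLYVASADQPAP"
ordered = matches_in_order(translated, "TLYPAP")

print(letters)           # ['T', 'P', 'Y', 'L', 'A']
print(repeated)          # TTPPPYYLAAA
print("".join(ordered))  # TPTYLYAAPAP
```

The list version keeps duplicates and positions, so the counts can still be derived from it later, whereas the dict alone cannot give the order back.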