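So my translation works fine, but when I run it through the assertion check it fails with an error saying the key should be a string, not a tuple. I see what the problem is, but I don't know how to fix it.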
AssertionError:
<class 'tuple'> != <class 'str'>
def frequency(dna_sequence):
    '''
    takes a DNA sequence (in string format) as input, parses it into codons using parse_sequence(),
    counts each type of codon and returns the codons' frequency as a dictionary of counts;
    the keys of the dictionary must be in string format
    '''
    codon_freq = dict()
    # split string with parse_sequence()
    # (a function written earlier; it actually wraps each codon string in a one-element tuple)
    parsed = parse_sequence(dna_sequence)
    # count each type of codon in the DNA sequence
    from collections import Counter
    codon_freq = Counter(parsed)
    return codon_freq
codon_freq1 = codon_usage(dna_sequence1)
print("Sequence 1 Codon Frequency:\n{0}".format(codon_freq1))
codon_freq2 = codon_usage(dna_sequence2)
print("\nSequence 2 Codon Frequency:\n{0}".format(codon_freq2))
Assertion check:
assert_equal(codon_usage('ATATTAAAGAATAATTTTATAAAAATATGT'),
{'AAA': 1, 'AAG': 1, 'AAT': 2, 'ATA': 3, 'TGT': 1, 'TTA': 1, 'TTT': 1})
assert_equal(type((list(codon_frequency1.keys()))[0]), str)
About parse_sequence:
def parse_sequence(dna_sequence):
    codons = []
    if len(dna_sequence) % 3 == 0:
        for i in range(0, len(dna_sequence), 3):
            codons.append((dna_sequence[i:i + 3],))
    return codons
Answer 0 (score: 3)
You may find it easier to understand if you use Counter directly, e.g.
>>> s = 'ATATTAAAGAATAATTTTATAAAAATATGT'
>>> [s[3*i:3*i+3] for i in range(len(s) // 3)]
['ATA', 'TTA', 'AAG', 'AAT', 'AAT', 'TTT', 'ATA', 'AAA', 'ATA', 'TGT']
>>> from collections import Counter
>>> Counter([s[3*i:3*i+3] for i in range(len(s) // 3)])
Counter({'ATA': 3, 'AAT': 2, 'TTA': 1, 'AAG': 1, 'TTT': 1, 'AAA': 1, 'TGT': 1})
Answer 1 (score: 1)
You are parsing correctly, but the results are one-element tuples rather than the strings the test requires, e.g.
>>> s = "ATATTAAAGAATAATTTTATAAAAATATGT"
>>> parse_sequence(s)
[('ATA',),
('TTA',),
('AAG',),
('AAT',),
('AAT',),
('TTT',),
('ATA',),
('AAA',),
('ATA',),
('TGT',)]
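Just remove the trailing comma from this line, so that each slice is appended as a plain string instead of being wrapped in a tuple: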
...
codons.append((dna_sequence[i:i + 3],))
...
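For reference, here is a minimal sketch of how the two functions from the question might look once the comma is gone. It assumes the function is called `frequency` as in the question's definition, and that `return codons` sits at function level, so a sequence whose length is not a multiple of 3 yields an empty list.

```python
from collections import Counter


def parse_sequence(dna_sequence):
    """Split a DNA string into a list of three-letter codon strings."""
    codons = []
    if len(dna_sequence) % 3 == 0:
        for i in range(0, len(dna_sequence), 3):
            # No trailing comma: append the slice itself, not a one-element tuple.
            codons.append(dna_sequence[i:i + 3])
    return codons


def frequency(dna_sequence):
    """Count how often each codon occurs; the keys are plain strings."""
    return Counter(parse_sequence(dna_sequence))


codon_freq = frequency("ATATTAAAGAATAATTTTATAAAAATATGT")
print(type(list(codon_freq.keys())[0]))  # <class 'str'>
```

A Counter compares equal to a plain dict with the same counts, so the first assertion in the question should pass as well.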
FYI, a sliding window is a technique that can be applied to codon matching. Here is a complete, simplified example using more_itertools.windowed (a third-party tool):
import collections as ct

import more_itertools as mit


def parse_sequence(dna_sequence):
    """Return a generator of codons."""
    return ("".join(codon) for codon in mit.windowed(dna_sequence, 3, step=3))


def frequency(dna_sequence):
    """Return a Counter of codon frequency."""
    parsed = parse_sequence(dna_sequence)
    return ct.Counter(parsed)
Test:
s = "ATATTAAAGAATAATTTTATAAAAATATGT"
expected = {'AAA': 1, 'AAG': 1, 'AAT': 2, 'ATA': 3, 'TGT': 1, 'TTA': 1, 'TTT': 1}
assert frequency(s) == expected
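One caveat: by default, windowed pads an incomplete final window with None, so the "".join above would raise a TypeError on a sequence whose length is not a multiple of 3. If that case matters, here is a standard-library-only sketch that simply drops the trailing partial codon. The names `parse_codons` and `codon_counts` are made up for this example, and dropping the remainder is an assumption rather than something the question specifies.

```python
from collections import Counter


def parse_codons(dna_sequence):
    """Yield complete codons; a trailing partial codon is dropped (assumption)."""
    usable = len(dna_sequence) - len(dna_sequence) % 3
    for i in range(0, usable, 3):
        yield dna_sequence[i:i + 3]


def codon_counts(dna_sequence):
    """Return a Counter of codon frequency with string keys."""
    return Counter(parse_codons(dna_sequence))


# Same sequence as in the test above plus two stray bases ("AC") at the end.
counts = codon_counts("ATATTAAAGAATAATTTTATAAAAATATGTAC")
expected = {'AAA': 1, 'AAG': 1, 'AAT': 2, 'ATA': 3, 'TGT': 1, 'TTA': 1, 'TTT': 1}
assert counts == expected
```

Whether to drop the remainder, raise an error, or return nothing at all (as the question's parse_sequence does) depends on what the assignment expects.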