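I'm working on a script that counts the elements in a given sequence. I've already found a way to do the counting, but I'd like to know whether a dictionary could be used instead, and how to detect and print the letters in the string that are not among the ones being counted.
For example: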
sequence = str(input('Enter DNA sequence:'))
print('Your sequence contain:', len(sequence), 'bases', 'with the following structure:')
adenine = sequence.count("A") + sequence.count("a")
thymine = sequence.count("T") + sequence.count("t")
cytosine = sequence.count("C") + sequence.count("c")
guanine = sequence.count("G") + sequence.count("g")
print("adenine =", adenine)
print("thymine =", thymine)
print("cytosine =", cytosine)
print("guanine =", guanine)
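I was thinking of a dictionary along these lines:
dicc = {'adenine': ['A', 'a'], 'thymine': ['T', 't'], 'cytosine': ['C', 'c'], 'guanine': ['G', 'g']}
But I don't know how to handle letters in the sequence that are not nucleotides. For example, for the following sequence the result should look like this: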
sequence = AacGTtxponwxs:
your sequence contain 13 bases with the following structure:
adenine = 2
thymine = 2
cytosine = 1
guanine = 1
p is not a DNA value
x is not a DNA value
o is not a DNA value
n is not a DNA value
w is not a DNA value
s is not a DNA value
Answer 0: (score: 1)
Using collections.Counter (a dict-like class), you can keep this more DRY:
from collections import Counter

sequence = 'AacGTtxponwxs'
s = sequence.lower()
bases = ['adenine', 'thymine', 'cytosine', 'guanine']
# The first letter of each base name doubles as its one-letter code.
codes = {b[0] for b in bases}
# Unique non-base letters, sorted so the output order is deterministic.
non_bases = sorted(set(s) - codes)
c = Counter(s)

for base in bases:
    print('{} = {}'.format(base, c[base[0]]))
# adenine = 2
# thymine = 2
# cytosine = 1
# guanine = 1

for n in non_bases:
    print('{} is not a DNA value'.format(n))
# n is not a DNA value
# o is not a DNA value
# p is not a DNA value
# s is not a DNA value
# w is not a DNA value
# x is not a DNA value
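If the unexpected letters should be reported once each, in the order they first appear in the sequence, dict.fromkeys is one way to deduplicate while keeping that order. This is only a minimal sketch: the helper name unique_non_bases and its valid parameter are hypothetical and not part of the answer above.

```python
def unique_non_bases(sequence, valid="atcg"):
    """Return the non-base letters of sequence, each once, in first-seen order."""
    lowered = sequence.lower()
    # dict.fromkeys drops duplicates and keeps insertion order
    # (guaranteed for dicts since Python 3.7).
    return list(dict.fromkeys(ch for ch in lowered if ch not in valid))


for letter in unique_non_bases('AacGTtxponwxs'):
    print('{} is not a DNA value'.format(letter))
```

For the example sequence this reports x, p, o, n, w and s, with the repeated x printed only once.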
Answer 1: (score: 0)
Try this:
sequence = 'AacGTtxponwxs'
adenine = 0
thymine = 0
cytosine = 0
guanine = 0
outputstring = []
for elem in sequence:
    if elem in ('a', 'A'):
        adenine += 1
    elif elem in ('T', 't'):
        thymine += 1
    elif elem in ('C', 'c'):
        cytosine += 1
    elif elem in ('G', 'g'):
        guanine += 1
    else:
        # Anything else is collected and reported after the counts.
        outputstring.append('{} is not a DNA value'.format(elem))
print('your sequence contain {} bases with the following structure:'.format(len(sequence)))
print('adenine =', adenine)
print('thymine =', thymine)
print('cytosine =', cytosine)
print('guanine =', guanine)
print("\n".join(outputstring))
Output:
your sequence contain 13 bases with the following structure:
adenine = 2
thymine = 2
cytosine = 1
guanine = 1
x is not a DNA value
p is not a DNA value
o is not a DNA value
n is not a DNA value
w is not a DNA value
x is not a DNA value
s is not a DNA value
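Since the question asks whether a dictionary can be used, here is a rough sketch of the same loop driven by a lookup table instead of a chain of comparisons. The names BASE_NAMES and count_bases are made up for illustration and do not come from any of the posts here.

```python
# Hypothetical lookup table: one-letter code -> base name.
BASE_NAMES = {'a': 'adenine', 't': 'thymine', 'c': 'cytosine', 'g': 'guanine'}


def count_bases(sequence):
    """Return (count per base name, list of characters that are not bases)."""
    counts = dict.fromkeys(BASE_NAMES.values(), 0)
    not_dna = []
    for elem in sequence:
        # Lowercase the character so 'A' and 'a' hit the same key.
        name = BASE_NAMES.get(elem.lower())
        if name is None:
            not_dna.append(elem)
        else:
            counts[name] += 1
    return counts, not_dna


counts, not_dna = count_bases('AacGTtxponwxs')
print('your sequence contain {} bases with the following structure:'.format(len('AacGTtxponwxs')))
for name, total in counts.items():
    print('{} = {}'.format(name, total))
for elem in not_dna:
    print('{} is not a DNA value'.format(elem))
```

With this layout, supporting another symbol (for example 'n' for an unknown base) would only mean adding one entry to the table, rather than another branch in the loop.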
Answer 2: (score: 0)
# Are you studying bioinformatics at HAN? I remember this as my assignment lol,
# 3 years ago
sequence = str(input('Enter DNA sequence:'))
# str.lower() returns a new string, so the result has to be assigned back.
sequence = sequence.lower()
count_sequence = 0
countA = 0
countT = 0
countG = 0
countC = 0
countNotDNA = 0
for char in sequence:
    count_sequence += 1
    # elif keeps the branches exclusive, so only non-bases reach the else.
    if char == 'a':
        countA += 1
    elif char == 't':
        countT += 1
    elif char == 'g':
        countG += 1
    elif char == 'c':
        countC += 1
    else:
        countNotDNA += 1
print("sequence is", count_sequence, "characters long containing:", "\n",
      countA, "Adenine", "\n", countT, "Thymine", "\n",
      countG, "Guanine", "\n", countC, "Cytosine", "\n",
      countNotDNA, "junk bases")
There you go :) :)