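For our Python project we have to solve several problems. However, we are stuck on this one:
"Write a function that, given a FASTA file name, returns a dictionary with the sequence IDs as keys and a tuple as the value. The value gives the minimum and maximum molecular weight of the sequence (sequences can be ambiguous)."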
import collections
from Bio import Seq
from itertools import product
def ListMW(file_name):
    seq_records = SeqIO.parse(file_name, 'fasta',alphabet=generic_dna)
    for record in seq_records:
        dictionary = Seq.IUPAC.IUPACData.ambiguous_dna_values
        result = []
        for i in product(*[dictionary[j] for j in record]):
            result.append("".join(i))
        molw = []
        for sequence in result:
            molw.append(SeqUtils.molecular_weight(sequence))
        tuple= (min(molw),max(molw))
        if min(molw)==max(molw):
            dict={record.id:molw}
        else:
            dict={record.id:(min(molw), max(molw))}
        print(dict)
With this code we manage to get this output:
{'seq_7009': (6236.9764, 6367.049999999999)}
{'seq_418': (3716.3642000000004, 3796.4124000000006)}
{'seq_9143_unamb': [4631.958999999999]}
{'seq_2888': (5219.3359, 5365.4089)}
{'seq_1101': (4287.7417, 4422.8254)}
{'seq_107': (5825.695099999999, 5972.8073)}
{'seq_6946': (5179.3118, 5364.420900000001)}
{'seq_6162': (5531.503199999999, 5645.577399999999)}
{'seq_504': (4556.920899999999, 4631.959)}
{'seq_3535': (3396.1715999999997, 3446.1969999999997)}
{'seq_4077': (4551.9108, 4754.0073)}
{'seq_1626_unamb': [3724.3894999999998]}
As you can see, this is not one dictionary but several. Is there a way to change our code, or add an extra command, to get this format:
{'seq_7009': (6236.9764, 6367.049999999999),
'seq_418': (3716.3642000000004, 3796.4124000000006),
'seq_9143_unamb': (4631.958999999999),
'seq_2888': (5219.3359, 5365.4089),
'seq_1101': (4287.7417, 4422.8254),
'seq_107': (5825.695099999999, 5972.8073),
'seq_6946': (5179.3118, 5364.420900000001),
'seq_6162': (5531.503199999999, 5645.577399999999),
'seq_504': (4556.920899999999, 4631.959),
'seq_3535': (3396.1715999999997, 3446.1969999999997),
'seq_4077': (4551.9108, 4754.0073),
'seq_1626_unamb': (3724.3894999999998)}
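Or otherwise make it explicit that it should use the sequence ID as the key and the molecular weights as the value of a single dictionary?
Answer 0: (score: 2)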
Create the dictionary before the for loop, then update it inside the loop.
Right now you create a new dictionary on every pass through the loop and print it. Printing many dictionaries is simply what the code does as written.
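A minimal sketch of that pattern, in plain Python without Biopython. The function name `collect_ranges` and the `sample` input are hypothetical stand-ins for the per-record `molw` lists computed in the question; the numbers are copied from the output shown there.

```python
def collect_ranges(weights_by_id):
    """Build ONE dictionary mapping each ID to (min, max) of its weights."""
    result = {}  # created once, before the loop
    for seq_id, molw in weights_by_id:
        # add an entry to the existing dictionary instead of rebinding it
        result[seq_id] = (min(molw), max(molw))
    return result  # returned (or printed) once, after the loop


# Hypothetical stand-in for the molecular weights computed per record
sample = [
    ("seq_7009", [6236.9764, 6367.049999999999]),
    ("seq_9143_unamb", [4631.958999999999]),
]
ranges = collect_ranges(sample)
print(ranges)
```

Because the dictionary lives outside the loop, every record adds a key to the same object, and a single print or return at the end shows all entries together.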
Answer 1: (score: 1)
You are creating a dictionary with one entry on every iteration.
You want:
from itertools import product
from Bio import SeqIO
from Bio.Data import IUPACData
from Bio.SeqUtils import molecular_weight
def ListMW(file_name):
    # Biopython 1.78 and later no longer accept the alphabet argument, so it is omitted
    seq_records = SeqIO.parse(file_name, 'fasta')
    # Maps each IUPAC DNA code to the bases it can stand for, e.g. 'R' -> 'AG'
    dictionary = IUPACData.ambiguous_dna_values
    retDict = {}  # one dictionary, created once before the loop
    for record in seq_records:
        # Expand the ambiguous sequence into every concrete sequence it can represent
        result = ["".join(i) for i in product(*[dictionary[j] for j in record.seq])]
        molw = [molecular_weight(sequence) for sequence in result]
        if min(molw) == max(molw):
            retDict[record.id] = molw
        else:
            retDict[record.id] = (min(molw), max(molw))
    # instead of printing inside the loop, return (or print) once at the end
    return retDict
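Your code recreates the dictionary on every iteration and stores it in the dict variable (better to call it dct, to avoid reusing the built-in type name).
So before the loop:
    dct = {}
and inside the loop, instead of your if + else code, assign the entry with a single conditional (ternary) expression, so that min and max are each computed only once.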
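One way this could look, as a sketch rather than the answerer's exact code. The helper name `weight_entry` and the `sample` data are hypothetical; the numbers are taken from the output in the question. Like the question's code, it keeps the one-element list when a sequence is unambiguous.

```python
def weight_entry(molw):
    """Return molw itself if all weights are equal, else a (min, max) tuple."""
    lo, hi = min(molw), max(molw)  # each computed exactly once
    # conditional (ternary) expression replacing the if/else block
    return molw if lo == hi else (lo, hi)


# Hypothetical stand-in for the molecular weights computed per record
sample = [
    ("seq_504", [4556.920899999999, 4631.959]),
    ("seq_1626_unamb", [3724.3894999999998]),
]

dct = {}  # created once, before the loop
for seq_id, molw in sample:
    dct[seq_id] = weight_entry(molw)
print(dct)
```

Returning `(lo, hi)` in both cases would give every value the same shape, which is usually easier for callers to handle; the mixed form above only mirrors the behavior of the original code.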