Question

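I'm trying to digest this string with formic acid, and I want to calculate the weight of each fragment obtained after digestion. I just want to know how to add the values from the dictionary to my new list. (Any advice would be appreciated.)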

import string

aa_seq = 'MLCPWNFLLKPRYRGKYEPGSSPAADLNNNEKGIGNEKSLVNGHIPNCETINPhSKSFP'

formic_acid = aa_seq.replace('A', 'A|').replace('N', 'N|').upper().split('|')

formate = list(formic_acid)

weights = {'A': 71.04, 'C': 103.01, 'D': 115.03, 'E': 129.04, 'F': 147.07,
           'G': 57.02, 'H': 137.06, 'I': 113.08, 'K': 128.09, 'L': 113.08,
           'M': 131.04, 'N': 114.04, 'P': 97.05, 'Q': 128.06, 'R': 156.10,
           'S': 87.03, 'T': 101.05, 'V': 99.07, 'W': 186.08, 'Y': 163.06 }
weight = []

for acid in formate:
    weight = weight + weights[acid]
print "The molecular weight of this protein is", weight

Output:

Traceback (most recent call last):
  File "r.py", line 15, in <module>
    weight = weight + weights[acid]
KeyError: 'MLCPWN'

Answer 1

If you want the sum of the weights of each letter in a fragment:

for acid in formate:
    weight = sum(weights[a] for a in acid)
print "The molecular weight of this protein is", weight
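
For context, the KeyError in the question arises because each element of formate is a whole fragment such as 'MLCPWN', while the keys of weights are single letters. A minimal sketch of the per-letter lookup follows; the helper name fragment_weight is hypothetical, and the table is shortened to the residues of the first fragment for brevity:

```python
# Subset of the residue masses from the question (enough for 'MLCPWN').
weights = {'M': 131.04, 'L': 113.08, 'C': 103.01,
           'P': 97.05, 'W': 186.08, 'N': 114.04}


def fragment_weight(fragment, table):
    """Sum the residue masses of one fragment, one letter at a time."""
    return sum(table[residue] for residue in fragment)


# Looking up the whole fragment fails, because only single letters are keys.
try:
    weights['MLCPWN']
except KeyError as err:
    print("KeyError:", err)

# Looking up each letter and summing works.
print(round(fragment_weight('MLCPWN', weights), 2))
```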

If you want to store each fragment together with its weight in a list:

weight_list = []
for acid in formate:
    weight = sum(weights[a] for a in acid)
    weight_list.append((acid,weight))
    print "The molecular weight of this protein is", weight



aa_seq = 'MLCPWNFLLKPRYRGKYEPGSSPAADLNNNEKGIGNEKSLVNGHIPNCETINPhSKSFP'

formic_acid = aa_seq.replace('A', 'A|').replace('N', 'N|').upper().split('|')

weights = {'A': 71.04, 'C': 103.01, 'D': 115.03, 'E': 129.04, 'F': 147.07,
           'G': 57.02, 'H': 137.06, 'I': 113.08, 'K': 128.09, 'L': 113.08,
           'M': 131.04, 'N': 114.04, 'P': 97.05, 'Q': 128.06, 'R': 156.10,
           'S': 87.03, 'T': 101.05, 'V': 99.07, 'W': 186.08, 'Y': 163.06 }

weight_list = []
for acid in formic_acid:
    weight = sum(weights[a] for a in acid)
    weight_list.append((acid,weight))
    print "The molecular weight of the protein {} is {}".format(acid,weight)
print(weight_list)

The molecular weight of the protein MLCPWN is 744.3
The molecular weight of the protein FLLKPRYRGKYEPGSSPA is 2047.06
The molecular weight of the protein A is 71.04
The molecular weight of the protein DLN is 342.15
The molecular weight of the protein N is 114.04
The molecular weight of the protein N is 114.04
The molecular weight of the protein EKGIGN is 598.29
The molecular weight of the protein EKSLVN is 670.35
The molecular weight of the protein GHIPN is 518.25
The molecular weight of the protein CETIN is 560.22
The molecular weight of the protein PHSKSFP is 780.38
[('MLCPWN', 744.3), ('FLLKPRYRGKYEPGSSPA', 2047.0599999999995), ('A', 71.04), ('DLN', 342.15000000000003), ('N', 114.04), ('N', 114.04), ('EKGIGN', 598.29), ('EKSLVN', 670.3499999999999), ('GHIPN', 518.25), ('CETIN', 560.22), ('PHSKSFP', 780.3799999999999)]
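
The long decimals in the printed list (for example 2047.0599999999995) are ordinary binary floating-point effects rather than errors in the weight table. A minimal sketch, assuming two decimal places are enough for display, that rounds the stored values:

```python
# Two entries copied from the output above.
weight_list = [('FLLKPRYRGKYEPGSSPA', 2047.0599999999995),
               ('DLN', 342.15000000000003)]

# Round each weight to two decimals, keeping the (fragment, weight) pairs.
rounded = [(fragment, round(weight, 2)) for fragment, weight in weight_list]
print(rounded)
```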

To get the minimum and maximum, just keep track of them in the loop:

mx = None
mn = None
for acid in formic_acid:
    weight = sum(weights[a] for a in acid)
    if mx is None or weight > mx:
        mx = weight
    if mn is None or weight < mn:
        mn = weight
    weight_list.append((acid,weight))
    print "The molecular weight of the protein {} is {}".format(acid,weight)
print("The minimum and maximum weights are:  {}, {}".format(mn,mx))
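
As an alternative to tracking the extremes by hand, the built-in min and max accept a key function, which also keeps the fragment attached to its weight. A small sketch, assuming weight_list already holds (fragment, weight) tuples (three sample entries are used here):

```python
from operator import itemgetter

# Sample (fragment, weight) pairs in the same shape as weight_list.
weight_list = [('MLCPWN', 744.3), ('A', 71.04), ('FLLKPRYRGKYEPGSSPA', 2047.06)]

# Compare the tuples by their second element, the weight.
lightest = min(weight_list, key=itemgetter(1))
heaviest = max(weight_list, key=itemgetter(1))

print("Lightest: {} ({})".format(*lightest))
print("Heaviest: {} ({})".format(*heaviest))
```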

Answer 2

Since you used the Biopython tag, how about using it?

from Bio.SeqUtils.ProtParam import ProteinAnalysis


prot = ProteinAnalysis(
    "MLCPWNFLLKPRYRGKYEPGSSPAADLNNNEKGIGNEKSLVNGHIPNCETINPhSKSFP".upper())

# Biopython's result includes one water molecule for the whole chain
# (the free termini); subtract it to get the sum of the residue masses.
print prot.molecular_weight() - 18.02

That's it! Now let's introduce the chain splitting. I think re.split would be better:

import re
from Bio.SeqUtils.ProtParam import ProteinAnalysis


prot = "MLCPWNFLLKPRYRGKYEPGSSPAADLNNNEKGIGNEKSLVNGHIPNCETINPhSKSFP".upper()

for peptide in re.split("(A|N)", prot):
    print peptide, ProteinAnalysis(peptide).molecular_weight() - 18.02
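
One caveat: with a capturing group, re.split returns the A and N delimiters as separate items (and can yield empty strings), so the pieces differ from the fragments in the question. A sketch of a swapped-in alternative, not the answer's own method, uses re.findall so that each fragment ends with the A or N it was cut after:

```python
import re

prot = "MLCPWNFLLKPRYRGKYEPGSSPAADLNNNEKGIGNEKSLVNGHIPNCETINPhSKSFP".upper()

# Either a run of non-A/N residues closed by one A or N,
# or a trailing run that has no A or N after it.
fragments = re.findall(r"[^AN]*[AN]|[^AN]+", prot)

for fragment in fragments:
    print(fragment)
```

The resulting fragments can then be passed to ProteinAnalysis or summed with the weights dictionary, whichever is preferred.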

Answer 3

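Change the following lines:

weight_list = []

for acid in formate:
    weight = 0  # reset for each fragment so the weights are not cumulative
    for w in acid:
        weight = weight + weights[w]
    weight_list.append(weight)

print "The molecular weight of this protein is", weight_list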
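
A quick sanity check for any per-fragment calculation is that the fragment weights add up to the weight of the undigested sequence. A minimal sketch with a list comprehension, using a short made-up sequence ('ADLNN') and a subset of the weights table purely for illustration:

```python
# Subset of the residue masses from the question.
weights = {'A': 71.04, 'D': 115.03, 'L': 113.08, 'N': 114.04}

protein = 'ADLNN'               # illustrative sequence, not from the question
fragments = ['A', 'DLN', 'N']   # the same sequence cut after each A and N

# One weight per fragment, each summed letter by letter.
fragment_weights = [sum(weights[r] for r in frag) for frag in fragments]

# The fragments together should weigh as much as the whole sequence.
whole = sum(weights[r] for r in protein)
print(fragment_weights)
print(abs(sum(fragment_weights) - whole) < 1e-6)
```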

How can I use the mw dictionary to add up the molecular weights for a new list of strings?
