Python: slicing a list of codons into sequences using the genetic code

Date: 2015-12-05 10:52:21

Tags: python sequence

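I have a list, and I want to look up the value of each element in a dictionary. I want to get back new lists that each start at 'AUG' and run until an element that maps to '*' or that is not in the dictionary.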

For example:

CodD = {"UUU":"F", "UUC":"F", "UUA":"L", "UUG":"L",
 "UCU":"S", "UCC":"S", "UCA":"S", "UCG":"S",
 "UAU":"Y", "UAC":"Y", "UAA":"*", "UAG":"*",
 "UGU":"C", "UGC":"C", "UGA":"*", "UGG":"W",
 "CUU":"L", "CUC":"L", "CUA":"L", "CUG":"L",
 "CCU":"P", "CCC":"P", "CCA":"P", "CCG":"P",
 "CAU":"H", "CAC":"H", "CAA":"Q", "CAG":"Q",
 "CGU":"R", "CGC":"R", "CGA":"R", "CGG":"R",
 "AUU":"I", "AUC":"I", "AUA":"I", "AUG":"M",
 "ACU":"T", "ACC":"T", "ACA":"T", "ACG":"T",
 "AAU":"N", "AAC":"N", "AAA":"K", "AAG":"K",
 "AGU":"S", "AGC":"S", "AGA":"R", "AGG":"R",
 "GUU":"V", "GUC":"V", "GUA":"V", "GUG":"V",
 "GCU":"A", "GCC":"A", "GCA":"A", "GCG":"A",
 "GAU":"D", "GAC":"D", "GAA":"E", "GAG":"E",
 "GGU":"G", "GGC":"G", "GGA":"G", "GGG":"G",}
Input list

['UAU', 'AUG', 'AAA', 'UAG', 'CAA', 'GUU', 'UUA', 'UUU', 'AAA', 'UAA', 'GGG',
 'UUU', 'AAA', 'UAC', 'AUU', 'ACA', 'CAU', 'AAC', 'AUU', 'UAG', 'ACU', 'UAG',
 'GGG', 'AUG', 'AAA', 'AAA', 'ACC', 'AAA', 'AAC', 'CAG', 'UUU', 'GUU', 'ACU',
 'UAA', 'CAU', 'GGC', 'AUU', 'GGG', 'CAG']

The result would be

['M','K']
['M','K','K','T','K','Q']

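These are formed as follows:

  1. Find the first 'AUG' element in the list; this starts an output sequence.
  2. Append the result of CodD[element] for each element to the output sequence.
  3. If CodD[element] does not exist, or is '*', the output sequence ends.
  4. Go back to step 1 until the input list is exhausted.
  5. Within such a sequence, once it has started, finding 'AUG' again does not matter.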

1 answer:

Answer 0 (score: 1)

You can use a generator:

def sequences(mapping, lst):
    result = None
    for elem in lst:
        if elem == 'AUG' and result is None:
            # start a new list
            result = []
        if result is None:
            # not currently building an output sequence, ignore this element
            continue
        value = mapping.get(elem)
        if value is None or value == '*':
            # sequence end
            yield result
            result = None
            continue
        result.append(value)
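
One edge case is worth noting. If the input runs out while a sequence is still open (no '*' and no unknown element was seen), the generator above never yields that partial sequence. The question does not say whether it should, so the following is only a sketch under the assumption that you might want it. The name `sequences_with_tail` and the `keep_unterminated` flag are hypothetical, introduced purely for illustration, and `demo_map` is a three-entry excerpt of CodD.

```python
def sequences_with_tail(mapping, lst, keep_unterminated=True):
    """Like sequences(), but can also yield a sequence left open at the end.

    The function name and keep_unterminated are hypothetical names
    used only in this sketch.
    """
    result = None
    for elem in lst:
        if elem == 'AUG' and result is None:
            # start a new output sequence
            result = []
        if result is None:
            # no sequence in progress, skip this element
            continue
        value = mapping.get(elem)
        if value is None or value == '*':
            # stop codon or unknown element: the sequence ends here
            yield result
            result = None
            continue
        result.append(value)
    # Input exhausted: a sequence may still be open (no stop codon seen).
    if keep_unterminated and result is not None:
        yield result


# Tiny excerpt of CodD, enough for this example.
demo_map = {"AUG": "M", "AAA": "K", "UAG": "*"}
codons = ['AUG', 'AAA', 'UAG', 'AUG', 'AAA']

with_tail = list(sequences_with_tail(demo_map, codons))
without_tail = list(sequences_with_tail(demo_map, codons, keep_unterminated=False))
print(with_tail)     # [['M', 'K'], ['M', 'K']]
print(without_tail)  # [['M', 'K']]
```

With `keep_unterminated=False` the behaviour matches the original generator, so the flag only changes what happens at the very end of the input.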

Demo:

>>> # I named your list 'sample' here
...
>>> for result in sequences(CodD, sample):
...     print(result)
...
['M', 'K']
['M', 'K', 'K', 'T', 'K', 'N', 'Q', 'F', 'V', 'T']
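
The second result is longer than the one listed in the question. Under the stated rules that appears to be expected: after the second 'AUG', the elements 'AAC', 'UUU', 'GUU' and 'ACU' all have entries in CodD, so the sequence only ends at 'UAA'. The sketch below re-runs just that tail of the list as a check. It repeats the generator so it runs on its own, written for Python 3, and `subset` and `tail` are illustrative names holding excerpts copied from CodD and the input list.

```python
def sequences(mapping, lst):
    """Same generator as in the answer, repeated so this snippet is self-contained."""
    result = None
    for elem in lst:
        if elem == 'AUG' and result is None:
            result = []
        if result is None:
            continue
        value = mapping.get(elem)
        if value is None or value == '*':
            yield result
            result = None
            continue
        result.append(value)


# Only the codons that get looked up in this excerpt; values copied from CodD.
subset = {"AUG": "M", "AAA": "K", "ACC": "T", "AAC": "N", "CAG": "Q",
          "UUU": "F", "GUU": "V", "ACU": "T", "UAA": "*"}

# Tail of the input list, from just before the second 'AUG' to the end.
tail = ['GGG', 'AUG', 'AAA', 'AAA', 'ACC', 'AAA', 'AAC', 'CAG', 'UUU', 'GUU',
        'ACU', 'UAA', 'CAU', 'GGC', 'AUU', 'GGG', 'CAG']

found = list(sequences(subset, tail))
print(found)  # [['M', 'K', 'K', 'T', 'K', 'N', 'Q', 'F', 'V', 'T']]
```

Elements seen while no sequence is in progress (such as the leading 'GGG' and everything after 'UAA') are skipped before any dictionary lookup, which is why they do not need entries in `subset`.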