Python_选择最长的AA序列

时间:2017-02-19 23:25:41

标签: python

我有以下序列,我正在尝试选择最长的序列。我知道它是第一个序列,但是如何告诉Python选择M(start)和*(stop)之间距离最长的序列?

First= MATVEPETTPTPNPPTTEEEKTESNQEVANPEHYIKHPLQNRWALWFFKNDKSKTWQANLRLISKFDTVEDFWALYNHIQLSSNLMPGCDYSLFKDGIEPMWEDEKNKRGGRWLITLNKQQRRSDLDRFWLETLLCLIGESFDDYSDDVCGAVVNVRAKGDKIAIWTTECENREAVTHIGRVYKERLGLPPKIVIGYQSHADTATKSGSTTKNRFVV*    
Second= WRLSNRKPPLLLIPRLQKRRKRNLIRRLLTQNTILNIPYRTDGHSGFLKMIKAKLGKQTCG*SPSLILLKTFGLCTTISSCLVI*CLAVTTHFLRMVLSLCGKMRKTNGEDDG*LH*TNSRDEVTSIAFG*RHFCALLENLLMTTVMMYVALLLMLELKVIR*QYGLLNVKTEKLLHI*GGYTRKG*DFLQR**LVISPTQTQLLRAAPPLKIGLLF    
Third= GDCRTGNHPYS*SPDYRRGENGI*SGGC*PRTLY*TSPTEQMGTLVF*K**KQNLASKPAADLQV*YC*RLLGSVQPYPVV**FNAWL*LLTF*GWY*AYVGR*EKQTGRTMANYIEQTAETK*PRSLLARDTSVPYWRIF**LQ**CMWRCC*C*S*R**DSNMDY*M*KQRSCYTYREGIQGKVRTSSKDSDWLSVPRRHSY*ERLHH*K*VCCL    
Fourth= LNNKPIFSGGAALSSCVCVGLITNHYLWRKS*PFLVYPPYMCNSFSVFTFSSPYCYLITFSSNINNSATYIITVVIKRFSNKAQKCL*PKAIEVTSSLLFVQCN*PSSSPFVFLIFPHRLNTILKK*VVTARH*ITRQLDMVVQSPKVFNSIKLGDQPQVCLPSFAFIIFKKPECPSVL*GMFNIVFWVSNLLIRFRFLLFCSRGIRSRGGFRFDSRH    
Fifth= *TTNLFLVVEPLLVAVSAWD**PITIFGGSPNLSLYTLPICVTASLFSHSVVHIAILSPLALTLTTAPHTSSL*SSKDSPIRHRSVSSQKRSRSLRLCCLFNVISHRPPRLFFSSSHIGSIPSLKSE*SQPGIKLLDNWIWLYRAQKSSTVSNLEISRRFACQVLLLSFLKNQSAHLFCRGCLI*CSGLATS*LDSVFSSSVVGGLGVGVVSGSTVA    
Sixth= KQQTYF*WWSRS**LCLRGTDNQSLSLEEVLTFPCIPSLYV*QLLCFHIQ*SILLSYHL*L*H*QQRHIHHHCSHQKILQ*GTEVSLAKSDRGHFVSAVCSM*LAIVLPVCFSHLPT*AQYHP*KVSSHSQALNY*TTGYGCTEPKSLQQYQTWRSAAGLLAKFCFYHF*KTRVPICSVGDV*YSVLG*QPPD*IPFSPLL*SGD*E*GWFPVRQSP

1 个答案:

答案 0 :(得分:1)

为了帮助您入门,我们提供了一些基本命令来获取您感兴趣的长度。我定义了一个函数来检查每个字符的第一个元素之间的长度。

First= 'MATVEPETTPTPNPPTTEEEKTESNQEVANPEHYIKHPLQNRWALWFFKNDKSKTWQANLRLISKFDTVEDFWALYNHIQLSSNLMPGCDYSLFKDGIEPMWEDEKNKRGGRWLITLNKQQRRSDLDRFWLETLLCLIGESFDDYSDDVCGAVVNVRAKGDKIAIWTTECENREAVTHIGRVYKERLGLPPKIVIGYQSHADTATKSGSTTKNRFVV*'
Second= 'WRLSNRKPPLLLIPRLQKRRKRNLIRRLLTQNTILNIPYRTDGHSGFLKMIKAKLGKQTCGSPSLILLKTFGLCTTISSCLVICLAVTTHFLRMVLSLCGKMRKTNGEDDGLHTNSRDEVTSIAFGRHFCALLENLLMTTVMMYVALLLMLELKVIRQYGLLNVKTEKLLHIGGYTRKGDFLQR**LVISPTQTQLLRAAPPLKIGLLF'
Third= 'GDCRTGNHPYSSPDYRRGENGISGGCPRTLYTSPTEQMGTLVFKKQNLASKPAADLQVYCRLLGSVQPYPVVFNAWLLLTFGWYAYVGREKQTGRTMANYIEQTAETKPRSLLARDTSVPYWRIFLQCMWRCCCS*R**DSNMDYMKQRSCYTYREGIQGKVRTSSKDSDWLSVPRRHSYERLHHK*VCCL'
Fourth= 'LNNKPIFSGGAALSSCVCVGLITNHYLWRKSPFLVYPPYMCNSFSVFTFSSPYCYLITFSSNINNSATYIITVVIKRFSNKAQKCLPKAIEVTSSLLFVQCNPSSSPFVFLIFPHRLNTILKKVVTARHITRQLDMVVQSPKVFNSIKLGDQPQVCLPSFAFIIFKKPECPSVLGMFNIVFWVSNLLIRFRFLLFCSRGIRSRGGFRFDSRH'
Fifth= '*TTNLFLVVEPLLVAVSAWD**PITIFGGSPNLSLYTLPICVTASLFSHSVVHIAILSPLALTLTTAPHTSSLSSKDSPIRHRSVSSQKRSRSLRLCCLFNVISHRPPRLFFSSSHIGSIPSLKSESQPGIKLLDNWIWLYRAQKSSTVSNLEISRRFACQVLLLSFLKNQSAHLFCRGCLICSGLATSLDSVFSSSVVGGLGVGVVSGSTVA'
Sixth= 'KQQTYF*WWSRS**LCLRGTDNQSLSLEEVLTFPCIPSLYVQLLCFHIQSILLSYHLLHQQRHIHHHCSHQKILQGTEVSLAKSDRGHFVSAVCSMLAIVLPVCFSHLPTAQYHPKVSSHSQALNYTTGYGCTEPKSLQQYQTWRSAAGLLAKFCFYHFKTRVPICSVGDVYSVLGQPPDIPFSPLLSGDE*GWFPVRQSP'

def sequence_length( string, char1, char2 ):
    try:
        return string.index(char2) - string.index(char1)
    except ValueError:
        return None

print( sequence_length( First, 'M', '*' ) )
print( sequence_length( Second, 'M', '*' ) )
print( sequence_length( Third, 'M', '*' ) )
print( sequence_length( Fourth, 'M', '*' ) )
print( sequence_length( Fifth, 'M', '*' ) )
print( sequence_length( Sixth, 'M', '*' ) )

返回:

217
135
98
None
None
-89