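I am new to Python and have been struggling with this for the past week. Could someone help me solve this problem? It would be very helpful for finishing my project.
I am trying to generate single mutations, and their combinations of 2 and 3, for a given sequence based on user input: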
INPUT SEQUENCE:> PEACCEL
User mutation input file:
E2R
C4W
E6G
#!/usr/bin/python
import getopt
import sys
import itertools as it
from itertools import groupby
def main(argv):
    try:
        # Short options that take a value end with ":"; long options that take a value end with "=".
        opts,operands = getopt.getopt(sys.argv[1:],'i:m:o:',["INPUT_FILE=","MUTATION_FILE=","OUTPUT_FILE=","help"])
        if len(opts) == 0:
            print "Please use the correct arguments, for usage type --help "
        else:
            for option,value in opts:
                if option == "-i" or option == "--INPUT_FILE":
                    seq=inputFile(value)
                if option == "-m" or option == "--MUTATION_FILE":
                    conA=MutationFile(value)
                if option == "-o" or option == "--OUTPUT_FILE":
                    out=outputName(value)
            return seq,conA
    except getopt.GetoptError,err:
        print str(err)
        print "Please use the correct arguments, for usage type --help"
def inputFile(value):
    try:
        fh = open(value,'r')
    except IOError:
        print "The file %s does not exist \n" % value
    else:
        ToSeperate= (x[1] for x in groupby(fh, lambda line: line[0] == ">"))
        for header in ToSeperate:
            header = header.next()[1:].strip()
            Sequence = "".join(s.strip() for s in ToSeperate.next())
            return Sequence
def MutationFile(value):
    try:
        fh=open(value,'r')
        content=fh.read()
        Rmcontent=str(content.rstrip())
    except IOError:
        # Report the file name that was passed in (MutFile was never defined).
        print "The file %s does not exist \n" % value
    else:
        con=list(Rmcontent)
        return con
def Mutation(SEQUENCES,conA):
    R=len(conA)
    if R>1:
        out=[]
        SecondNum=1
        ThirdChar=2
        for index in range(len(conA)):
            MR=conA[index]
            if index==SecondNum:
                SN=MR
                SecondNum=SecondNum+4
            if index==ThirdChar:
                TC=MR
                ThirdChar=ThirdChar+4
                SecNum=int(SN.rstrip())
                MutateResidue=str(TC.rstrip())
                for index in range(len(SEQUENCES)):
                    if index==SecNum-1:
                        NonMutate=SEQUENCES[index]
                        AfterMutate=NonMutate.replace(NonMutate,MutateResidue)
                        new=SEQUENCES[ :index]+AfterMutate+SEQUENCES[index+1: ]
                        MutatedInformation= ['>',NonMutate,index+1,MutateResidue,'\n',new]
                        values2 = ''.join(str(i)for i in MutatedInformation)
if __name__ == "__main__":
    seq,conA=main(sys.argv[1:])
    Mutation(seq,conA)
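This is part of my program. I replace E, C, E at positions (2, 4, 6) with R, W, G and store the mutated sequences in a variable named R, which contains three lines as follows: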
PrACCEL
PEAwCEL
PEACCgL
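Now I want to make combinations of 2 and 3 from these three single mutations, that is, two mutations combined in one sequence and three mutations combined in one sequence.
The sample and expected output would be as follows: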
2C
PrAwCEL
PrACCgL
PEAwCgL
3C
PrAwCgL
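Algorithm
This is only part of my code, so I will explain my algorithm:
1. I read the mutation file, where each entry has three characters, e.g. (E2R): (E) is the amino acid letter, (2) is its position in the input sequence PEACCEL, and the third letter (R) is the residue that E2 will be changed to.
2. First, I extract the position and the third character from the user mutation file and store them in the variables SecNum and MutateResidue (the third character).
3. Then I loop over the indices of the sequence (PEACCEL), and wherever an index matches SecNum (positions 2, 4, 6), I replace that residue with MutateResidue, which is the third character in the mutation file (2R, 4W, 6G).
4. Finally, I join the mutated residue with the remaining residues using this line: new = SEQUENCES[:index] + AfterMutate + SEQUENCES[index+1:]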
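The four steps above can be sketched compactly as follows. This is only an illustrative Python 3 sketch, not the program from the question: the helper names parse_mutation and apply_mutation are hypothetical, the regular expression assumes each entry has the form letter, digits, letter, and the new residue is lower-cased only to match the sample output shown earlier.

```python
import re

# Hypothetical helpers for illustration; they are not part of the original program.
MUTATION_RE = re.compile(r"^([A-Za-z])(\d+)([A-Za-z])$")

def parse_mutation(text):
    """Split a string such as 'E2R' into ('E', 2, 'R')."""
    match = MUTATION_RE.match(text.strip())
    if match is None:
        raise ValueError("not a valid mutation: %r" % text)
    original, position, replacement = match.groups()
    return original, int(position), replacement

def apply_mutation(sequence, mutation):
    """Return a copy of sequence with one mutation applied (positions are 1-based)."""
    original, position, replacement = parse_mutation(mutation)
    index = position - 1
    # Check that the mutation really describes the residue found at that position.
    if not 0 <= index < len(sequence) or sequence[index] != original:
        raise ValueError("%s does not match the sequence" % mutation)
    # Lower-case the new residue so it stands out, as in the sample output.
    return sequence[:index] + replacement.lower() + sequence[index + 1:]

for m in ["E2R", "C4W", "E6G"]:
    print(apply_mutation("PEACCEL", m))
```

Parsing each line as one unit, rather than walking the file character by character with fixed offsets, also keeps working when a position has more than one digit.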
Thanks in advance.
Answer 0 (score: 0)
from itertools import combinations,chain
from collections import Counter
def Mutation(SEQUENCES,conA):
    #mutations=map(lambda x:x.strip(),open('a.txt','r').readlines())
    mutation_combinations= chain.from_iterable([list(combinations(conA,i))for i in range(1,4)])
    #[('E2R',), ('C4W',), ('E6G',), ('E2R', 'C4W'), ('E2R', 'E6G'), ('C4W', 'E6G'), ('E2R', 'C4W', 'E6G')]
    for i in mutation_combinations:
        print "combination :"+'_'.join(i)
        c=Counter({})
        temp_string=SEQUENCES
        for j in i:
            c[j[1]]=j[2].lower()
        for index,letter in c.items():
            temp_string=temp_string[:int(index)-1]+letter+temp_string[int(index):]
        print temp_string
combination :E2R
PrACCEL
combination :C4W
PEAwCEL
combination :E6G
PEACCgL
combination :E2R_C4W
PrAwCEL
combination :E2R_E6G
PrACCgL
combination :C4W_E6G
PEAwCgL
combination :E2R_C4W_E6G
PrAwCgL
The algorithm I followed:
mutations=map(lambda x:x.strip(),open('a.txt','r').readlines())
mutation_combinations= chain.from_iterable([list(combinations(mutations,i))for i in range(1,4)])
If you have 4 mutations and want the combination of all four as well, change the range value to 5, i.e. range(1,5).
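Rather than editing the range bound by hand, the upper limit can be derived from the number of mutations. The following is a minimal Python 3 sketch under that assumption; the function name all_combinations and the fourth mutation L7K are made up for illustration.

```python
from itertools import chain, combinations

def all_combinations(mutations, max_size=None):
    """Return an iterator over every combination of 1..max_size mutations."""
    if max_size is None:
        # Default: go all the way up to the combination that uses every mutation.
        max_size = len(mutations)
    return chain.from_iterable(
        combinations(mutations, size) for size in range(1, max_size + 1)
    )

three = ["E2R", "C4W", "E6G"]
four = three + ["L7K"]  # L7K is a made-up fourth mutation

print(len(list(all_combinations(three))))  # 3 + 3 + 1 = 7
print(len(list(all_combinations(four))))   # 4 + 6 + 4 + 1 = 15
```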
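So, for each combination, I replace the residues at the given positions with the specified characters:
for j in i:
    c[j[1]]=j[2].lower()
I use the Counter above to keep track of which characters are to be replaced for each mutation combination.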
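For reference, here is a Python 3 sketch of the same idea that produces the 2C and 3C groups from the question. It is a variation rather than the code above: a plain dict stands in for the Counter (which is only used as a position-to-letter mapping here), the position is read as j[1:-1] so multi-digit positions are assumed to work, and the names mutate and combination_report are hypothetical.

```python
from itertools import combinations

def mutate(sequence, combo):
    """Apply every mutation in combo, e.g. ('E2R', 'C4W'), to sequence."""
    # Map each 1-based position to its lower-cased replacement residue.
    replacements = {int(m[1:-1]): m[-1].lower() for m in combo}
    residues = list(sequence)
    for position, letter in replacements.items():
        residues[position - 1] = letter
    return "".join(residues)

def combination_report(sequence, mutations, size):
    """Return (label, mutated sequence) pairs for all combinations of the given size."""
    return [
        ("_".join(combo), mutate(sequence, combo))
        for combo in combinations(mutations, size)
    ]

mutations = ["E2R", "C4W", "E6G"]
for size in (2, 3):
    print("%dC" % size)
    for label, mutated in combination_report("PEACCEL", mutations, size):
        print(mutated)
```

Because every replacement swaps exactly one character, the sequence length never changes, so the mutations in a combination can be applied in any order.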