Question

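I have been working on a project that requires me to count the number of 1's and 0's that describe the contribution of each amino acid to the stability of a peptide. The file contains about 300 different peptide sequences. I want my code to identify the start of each peptide sequence in my text file, calculate its length, and then count the number of 1's and 0's recorded for its amino acids. So far I have been struggling to get my code to use the index number to identify the start of a sequence. Here is what I have: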

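    input_file01 = open(r'C:/Users/12345/Documents/Dr Blan Research/MHC I 17 NOV2016.txt')
    Output_file01 = open('MHC I 17 NOV2016OUT.txt', 'w')
    count = 0  # number of peptide sequences seen so far
    for line in input_file01:
        templist = line.split()
        # Skip blank lines and the '#' header lines
        if not templist or templist[0].startswith('#'):
            continue
        a = templist[0]  # amino acid number, read as a string such as '1'
        b = templist[1]  # one-letter code
        if a == '1':     # residue number 1 marks the start of a new peptide
            count += 1
            Output_file01.write('\n')
        Output_file01.write(a + ' ' + b + '\n')
    input_file01.close()
    Output_file01.close()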


Here is an example of the content of the file. For each peptide sequence, I want my code to determine its length, count the number of 1's and 0's, and find their ratio.

    #   1 - Amino acid number
    #   2 - One letter code
    #   3 - ANCHOR probability value
    #   4 - ANCHOR output
    #   
    1   A         0.3129    0
    2   P         0.4044    0
    3   K         0.5258    1
    4   R         0.6358    1
    5   P         0.7277    1
    6   P         0.7895    1
    7   S         0.8710    1
    8   A         0.9358    1
    9   F         0.9680    1
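One way to approach the counting described above is sketched below. It is only an illustration under several assumptions: every data row has the four whitespace-separated columns shown in the sample, a row whose amino acid number is 1 starts a new peptide, and the "ratio" is taken to be the fraction of residues whose ANCHOR output is 1. The function name `summarize_peptides` and the dictionary keys are hypothetical, and the code assumes Python 3 so that `/` is true division.

```python
def summarize_peptides(lines):
    """Group ANCHOR rows into peptides and count the 0/1 outputs of each."""
    peptides = []
    current = None
    for line in lines:
        fields = line.split()
        # Skip blank lines and '#' header lines.
        if not fields or fields[0].startswith('#'):
            continue
        # Skip rows that do not look like "number letter probability output".
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        number, letter, flag = int(fields[0]), fields[1], fields[3]
        # Assumption: residue number 1 marks the start of a new peptide.
        if number == 1 or current is None:
            current = {'sequence': '', 'ones': 0, 'zeros': 0}
            peptides.append(current)
        current['sequence'] += letter
        if flag == '1':
            current['ones'] += 1
        elif flag == '0':
            current['zeros'] += 1
    for p in peptides:
        p['length'] = len(p['sequence'])
        # Fraction of residues flagged 1; length is at least 1 here.
        p['fraction_ones'] = p['ones'] / p['length']
    return peptides


# The sample rows from the question, used as a small self-check.
sample = """\
#   1 - Amino acid number
#   2 - One letter code
#   3 - ANCHOR probability value
#   4 - ANCHOR output
#
1   A         0.3129    0
2   P         0.4044    0
3   K         0.5258    1
4   R         0.6358    1
5   P         0.7277    1
6   P         0.7895    1
7   S         0.8710    1
8   A         0.9358    1
9   F         0.9680    1
"""

for p in summarize_peptides(sample.splitlines()):
    print(p['sequence'], p['length'], p['ones'], p['zeros'],
          round(p['fraction_ones'], 3))
# prints: APKRPPSAF 9 7 2 0.778
```

To run this on the real file, the open file object can be passed in place of `sample.splitlines()`, since the function only iterates over lines. A ratio of 1's to 0's could be computed instead, but it would need a guard for peptides that contain no 0's.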

Writing specific lines and characters from an input file to an output file

0 answers: