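Problem joining strings into a list without \n - Python 3

Time: 2018-04-06 23:37:31

Tags: python-3.x

I am currently having trouble appending a string to a new list. When the script finishes, my list looks like this: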

['MDAALLLNVEGVKKTILHGGTGELPNFITGSRVIFHFRTMKCDEERTVIDDSRQVGQPMH\nIIIGNMFKLEVWEILLTSMRVHEVAEFWCDTIHTGVYPILSRSLRQMAQGKDPTEWHVHT\nCGLANMFAYHTLGYEDLDELQKEPQPLVFVIELLQVDAPSDYQRETWNLSNHEKMKAVPV\nLHGEGNRLFKLGRYEEASSKYQEAIICLRNLQTKEKPWEVQWLKLEKMINTLILNYCQCL\nLKKEEYYEVLEHTSDILRHHPGIVKAYYVRARAHAEVWNEAEAKADLQKVLELEPSMQKA\nVRRELRLLENRMAEKQEEERLRCRNMLSQGATQPPAEPPTEPPAQSSTEPPAEPPTAPSA\nELSAGPPAEPATEPPPSPGHSLQH\n']

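I would like to remove the newline characters somehow. I have looked at other questions here, and most of them suggest using .rstrip, but adding it to my code gives the same output. What am I missing? Apologies if this question has already been asked.

My input looks like this (first 3 lines):

    >sp|Q9NZN9|AIPL1_HUMAN Aryl-hydrocarbon-interacting protein-like 1 OS=Homo sapiens OX=9606 GN=AIPL1 PE=1 SV=2
    MDAALLLNVEGVKKTILHGGTGELPNFITGSRVIFHFRTMKCDEERTVIDDSRQVGQPMH
    IIIGNMFKLEVWEILLTSMRVHEVAEFWCDTIHTGVYPILSRSLRQMAQGKDPTEWHVHT

And this is my code: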

from sys import argv

protein = argv[1] #fasta file 

sequence = '' #string linker 
get_line = False #False = not the sequence 
Uniprot_ID = []
sequence_list =[]
with open(protein) as pn:
    for line in pn: 
        line.rstrip("\n")
        if line.startswith(">") and get_line == False:
            sp, u_id, name = line.strip().split('|')
            Uniprot_ID.append(u_id) 
            get_line = True 
            continue
        if line.startswith(">") and get_line == True:
            sequence.rstrip('\n')
            sequence_list.append(sequence) #add the amino acids onto the list
            sequence = ''  #resets the str
        if line != ">" and get_line == True: #if the first line is not a fasta ID and is it a sequence? 
            sequence += line 
print(sequence_list)

2 Answers:

Answer 0: (score: 1)

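According to the documentation, rstrip removes trailing characters, that is, only the characters at the very end of the string. You may have misread how others use it to remove \n, because a newline usually does sit at the end.

To replace a character with something else throughout the whole string, use replace instead.

Note that these methods do not modify your string. They return a new string, so if you want to change what the current string variable holds, assign the result back to the original variable: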

>>> line = 'ab\ncd\n'
>>> line.rstrip('\n')
'ab\ncd'        # note: this is the immediate result, which is not assigned back to line
>>> line = line.replace('\n', '')
>>> line
'abcd'
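
The same point can be checked in a small script. This is only a minimal sketch: the helper name strip_demo and the sample string are made up for illustration and are not part of the question's code.

```python
def strip_demo(text):
    """Return (rstrip result, replace result) without touching text itself."""
    # rstrip only drops newlines at the very end of the string
    trailing_removed = text.rstrip('\n')
    # replace drops every newline, wherever it occurs
    all_removed = text.replace('\n', '')
    return trailing_removed, all_removed


line = 'ab\ncd\n'
trailing_removed, all_removed = strip_demo(line)

print(repr(line))              # the original string is unchanged
print(repr(trailing_removed))  # the inner newline is still there
print(repr(all_removed))       # no newlines left
```

In the question's loop, this means a bare line.rstrip("\n") has no effect unless its result is assigned, for example line = line.rstrip("\n").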

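Answer 1: (score: 0)

When I asked this question, I had not taken the time to read the documentation and understand my own code. After looking at both, I realized two things:

  1. My code was not actually collecting the content I was interested in.
  2. For the specific question I asked, I can simply use line.strip() to remove the '\n'.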

    from sys import argv

    protein = argv[1]  # FASTA file
    sequence = ''      # accumulates the sequence of the current record
    get_line = False   # False = no header line has been seen yet
    Uniprot_ID = []
    sequence_list = []
    uni_seq = {}
    # This block takes a UniProt FASTA file and creates a dictionary
    # with the UniProt ID as the key and the sequence as the value.
    with open(protein) as pn:
        for line in pn:
            if line.startswith(">"):
                if get_line:
                    # a new record starts, so store the one just finished
                    uni_seq[u_id] = sequence
                    sequence_list.append(sequence)
                    sequence = ''
                sp, u_id, name = line.strip().split('|')
                Uniprot_ID.append(u_id)
                get_line = True
            elif get_line:
                sequence += line.strip()  # strip() removes the trailing newline
    # store the last record, which has no following header to trigger it
    if get_line:
        uni_seq[u_id] = sequence
        sequence_list.append(sequence)
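
For comparison, the same idea can be wrapped in a function that collects the lines of each record and joins them once at the end. This is only a sketch under a few assumptions: the function name parse_fasta is made up, the sequences are shortened, and the second record (ID P00000) is a hypothetical entry added so the example has more than one header. It reads from an in-memory io.StringIO instead of a file so that it runs on its own.

```python
import io


def parse_fasta(handle):
    """Map each UniProt ID to its sequence with the line breaks removed."""
    sequences = {}
    current_id = None
    chunks = []
    for line in handle:
        line = line.strip()  # drops the trailing newline
        if not line:
            continue  # skip blank lines
        if line.startswith('>'):
            if current_id is not None:
                # a new header closes the previous record
                sequences[current_id] = ''.join(chunks)
            # header looks like >sp|Q9NZN9|AIPL1_HUMAN ...
            current_id = line.split('|')[1]
            chunks = []
        elif current_id is not None:
            chunks.append(line)
    if current_id is not None:
        # the last record has no following header, so store it here
        sequences[current_id] = ''.join(chunks)
    return sequences


# Shortened, partly hypothetical input used only for this illustration.
example = io.StringIO(
    ">sp|Q9NZN9|AIPL1_HUMAN example header\n"
    "MDAALLLNVEGVKK\n"
    "IIIGNMFKLEVWEI\n"
    ">sp|P00000|FAKE_HUMAN hypothetical second record\n"
    "AAAA\n"
)
result = parse_fasta(example)
print(result)
```

Collecting the pieces in a list and calling ''.join once avoids building many intermediate strings, although for files of this size the += approach above works just as well.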