Conversion of format in Python

Date: 2013-02-07 17:14:07

Tags: python-2.7 formatting

I have the following data:

>P1;gi|467971|gb|AA3.1|

-MASLAALLPLLALLVLCRLDPAQA
QAEPGAGG-LQELALQ---KRGIVE
QCCTSICSLYQLEN---
*
>P1;gi|307072|gb|AAA59179.1|

-MALWMRLLPLLALLALWGPDPAAA
FPK-TR-EAPGAGS-LEGSLQ--KRE
QCCTSICSLYQLENYCN
*
>P1;gi|387059|gb|AAA31.1|

-MALVLALLALWNTNQAFVS-RHLC
FYIPK-DRREG-LQLQ---KRGIVD
QCCTGTCTRHQLQS---
*

In Python, how can I convert this into data like the following?

-MASLAALLPLLALLVLCRLDPAQAQAEPGAGG-LQELALQ---KRGIVEQCCTSICSLYQLEN---,-MALWMRLLPLLALLALWGPDPAAAFPK-TR-EAPGAGS-LEGSLQ--KREQCCTSICSLYQLENYCN,-MALVLALLALWNTNQAFVS-RHLCFYIPK-DRREG-LQLQ---KRGIVDQCCTGTCTRHQLQS---

2 answers:

Answer 0 (score: 0)

A crude way, where data is your string:

>>> lines = data.replace('*', ',').splitlines()
>>> ''.join(line for line in lines if line and not line.startswith('>')).rstrip(',')
'-MASLAALLPLLALLVLCRLDPAQAQAEPGAGG-LQELALQ---KRGIVEQCCTSICSLYQLEN---,-MALWMRLLPLLALLALWGPDPAAAFPK-TR-EAPGAGS-LEGSLQ--KREQCCTSICSLYQLENYCN,-MALVLALLALWNTNQAFVS-RHLCFYIPK-DRREG-LQLQ---KRGIVDQCCTGTCTRHQLQS---'
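The same idea can be written as a small function that first splits the text into records, which may be easier to follow than the one-liner. This is only a sketch: join_sequences is a hypothetical helper name, and it assumes every record ends with a line containing '*' and that header lines start with '>'. The sample uses the first two records from the question.

```python
def join_sequences(data):
    """Return the sequences found in data, joined by commas (hypothetical helper)."""
    sequences = []
    # Assumption: each record is terminated by '*', so splitting on it
    # gives one chunk of text per record.
    for record in data.split('*'):
        # Keep only the sequence lines: drop blank lines and the '>' header.
        parts = [line.strip() for line in record.splitlines()
                 if line.strip() and not line.strip().startswith('>')]
        if parts:
            sequences.append(''.join(parts))
    return ','.join(sequences)


sample = """>P1;gi|467971|gb|AA3.1|

-MASLAALLPLLALLVLCRLDPAQA
QAEPGAGG-LQELALQ---KRGIVE
QCCTSICSLYQLEN---
*
>P1;gi|307072|gb|AAA59179.1|

-MALWMRLLPLLALLALWGPDPAAA
FPK-TR-EAPGAGS-LEGSLQ--KRE
QCCTSICSLYQLENYCN
*
"""

result = join_sequences(sample)
print(result)
```

Because the sequences are collected in a list before joining, there is no trailing comma to strip afterwards.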

Answer 1 (score: 0)

Assuming the data is available in file1.txt, you can use this code:

# Read all lines; the with-block closes the file automatically.
with open(r'C:\Users\kvivek\Desktop\file1.txt', 'r') as file_handle:
    fileContent = file_handle.readlines()

output = ''
for line in fileContent:
    # Skip the header lines, e.g. ">P1;gi|467971|gb|AA3.1|".
    if ">P1;gi" in line:
        continue
    # strip() removes the newline, so the sequence lines are concatenated.
    output = output + line.strip()

# Replace every * with a comma, then strip the trailing comma.
finalOutput = output.replace("*", ",").rstrip(',')
print(finalOutput)
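If the headers are also needed, the loop above can be turned into a generator that yields one (header, sequence) pair per record. This is a rough sketch rather than a tested parser: iter_records is a hypothetical name, it accepts any iterable of lines (an open file works as well as a list), and it assumes each record is closed by a '*' line, so a record without one would be silently dropped. The sample is the third record from the question.

```python
def iter_records(lines):
    """Yield (header, sequence) pairs from an iterable of lines (hypothetical helper)."""
    header, parts = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue                        # ignore blank lines
        if line.startswith('>'):
            header, parts = line[1:], []    # a new record begins
        elif line == '*':
            yield header, ''.join(parts)    # '*' closes the current record
            header, parts = None, []
        else:
            parts.append(line)              # one piece of the sequence


lines = [
    ">P1;gi|387059|gb|AAA31.1|\n",
    "\n",
    "-MALVLALLALWNTNQAFVS-RHLC\n",
    "FYIPK-DRREG-LQLQ---KRGIVD\n",
    "QCCTGTCTRHQLQS---\n",
    "*\n",
]

records = list(iter_records(lines))
print(','.join(seq for _, seq in records))
```

To read from a file instead, pass the open file object, for example iter_records(file_handle) inside a with-block.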