I have the following data:
>P1;gi|467971|gb|AA3.1|
-MASLAALLPLLALLVLCRLDPAQA
QAEPGAGG-LQELALQ---KRGIVE
QCCTSICSLYQLEN---
*
>P1;gi|307072|gb|AAA59179.1|
-MALWMRLLPLLALLALWGPDPAAA
FPK-TR-EAPGAGS-LEGSLQ--KRE
QCCTSICSLYQLENYCN
*
>P1;gi|387059|gb|AAA31.1|
-MALVLALLALWNTNQAFVS-RHLC
FYIPK-DRREG-LQLQ---KRGIVD
QCCTGTCTRHQLQS---
*
In Python, how can I convert this into data like the following?
-MASLAALLPLLALLVLCRLDPAQAQAEPGAGG-LQELALQ---KRGIVEQCCTSICSLYQLEN---,-MALWMRLLPLLALLALWGPDPAAAFPK-TR-EAPGAGS-LEGSLQ--KREQCCTSICSLYQLENYCN,-MALVLALLALWNTNQAFVS-RHLCFYIPK-DRREG-LQLQ---KRGIVDQCCTGTCTRHQLQS---
Answer 0: (score: 0)
A naive way, assuming data is your string:
>>> lines = data.replace('*', ',').splitlines()
>>> ''.join(line for line in lines if line and not line.startswith('>')).rstrip(',')
'-MASLAALLPLLALLVLCRLDPAQAQAEPGAGG-LQELALQ---KRGIVEQCCTSICSLYQLEN---,-MALWMRLLPLLALLALWGPDPAAAFPK-TR-EAPGAGS-LEGSLQ--KREQCCTSICSLYQLENYCN,-MALVLALLALWNTNQAFVS-RHLCFYIPK-DRREG-LQLQ---KRGIVDQCCTGTCTRHQLQS---'
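For reuse, the one-liner above could be wrapped in a small function. This is only a minimal sketch: the name join_sequences and the shortened sample string are made up for illustration and are not part of the original answer. It also strips surrounding whitespace from each line, which the one-liner does not do.

```python
def join_sequences(data):
    """Concatenate the sequence lines of each record, records separated by commas."""
    # Turn each record terminator '*' into a comma separator
    lines = data.replace('*', ',').splitlines()
    # Drop blank lines and '>' header lines, glue the rest together,
    # then remove the comma left behind by the final '*'
    return ''.join(
        line.strip() for line in lines
        if line.strip() and not line.startswith('>')
    ).rstrip(',')


# Hypothetical, shortened sample in the same layout as the question's data
sample = """>P1;id1
-MAS
QAE-
*
>P1;id2
-MAL
FPK
*
"""

print(join_sequences(sample))
```

Note that this relies on '*' never appearing inside a header line, since the replacement runs over the whole string before the headers are filtered out.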
Answer 1: (score: 0)
Assuming the data is available in file1.txt, you can use this code:
# Read all lines; the with-block closes the file automatically
with open(r'C:\Users\kvivek\Desktop\file1.txt', 'r') as file_handle:
    fileContent = file_handle.readlines()

output = ''
for line in fileContent:
    # Skip the header line of each record
    if ">P1;gi" in line:
        continue
    # Append the line without its newline or surrounding whitespace
    output += line.strip()

# Replace every * with a comma, then strip the trailing comma
finalOutput = output.replace("*", ",").rstrip(',')
print(finalOutput)
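If you would rather keep each record as a separate string before joining, the loop above can be adapted to collect a list. This is a rough sketch, not a definitive implementation: the function name sequences_from_lines and the sample lines are hypothetical, and it assumes that each record ends with a '*' on a line of its own, as in the question's data.

```python
def sequences_from_lines(lines):
    """Return one joined sequence string per record."""
    records = []
    current = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith('>'):
            continue  # skip blank lines and record headers
        if line == '*':
            # A lone '*' closes the current record
            records.append(''.join(current))
            current = []
        else:
            current.append(line)
    if current:
        # Tolerate a final record that lacks a closing '*'
        records.append(''.join(current))
    return records


# Hypothetical, shortened sample; an open file handle can be passed instead
sample_lines = ['>P1;id1\n', '-MAS\n', 'QAE-\n', '*\n',
                '>P1;id2\n', '-MAL\n', 'FPK\n', '*\n']

print(','.join(sequences_from_lines(sample_lines)))
```

Because the function only iterates over its argument, it reads a file line by line when given a file handle, and the list it returns can be joined with ',' or processed record by record.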