How do I collect several strings into a list inside a for loop?

Asked: 2017-09-16 18:09:58

Tags: python string list

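I am using a for loop to look up a list of protein IDs in the NCBI protein database and trying to convert those IDs into descriptions. Here is an example: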

import pandas as pd
from Bio import Entrez
from Bio import SeqIO

df = pd.read_csv('ID.txt', header=None)
df.columns = ['protein_ID']  # give the single column the header 'protein_ID'
lists = df.protein_ID.tolist()  # convert the column into a list of protein IDs

description = ''
for num, line in enumerate(lists):
    handle = Entrez.efetch(db="protein", id=line, rettype="gb", retmode="text")
    record = SeqIO.read(handle, "genbank")
    description += record.description

description

It returns one huge string:

'hypothetical protein UR61_C0009G0014 [candidate division WS6 bacterium GW2011_GWE1_34_7]ATPase [candidate division WS6 bacterium GW2011_GWE2_33_157]hypothetical protein UR96_C0034G0007 [candidate division WS6 bacterium GW2011_GWC1_36_11]phosphoenolpyruvate synthase [Candidatus Komeilibacteria bacterium RIFOXYC1_FULL_37_11]'

What I want is a list of strings with line breaks, like this:

[
'hypothetical protein UR61_C0009G0014 [candidate division WS6 bacterium GW2011_GWE1_34_7]',
'ATPase [candidate division WS6 bacterium GW2011_GWE2_33_157]',
'hypothetical protein UR96_C0034G0007 [candidate division WS6 bacterium GW2011_GWC1_36_11]',
'phosphoenolpyruvate synthase [Candidatus Komeilibacteria bacterium RIFOXYC1_FULL_37_11]'
]

How can I achieve this? Thank you very much!

1 answer:

Answer 0 (score: 0)

  

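"What I want is a list of strings"

Start with an empty list and append each description to it, instead of concatenating onto a string: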

description = []
for num, line in enumerate(lists):
    ...  # same Entrez.efetch / SeqIO.read calls as in the question
    description.append(record.description)
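
To make the difference concrete, here is a minimal offline sketch. It assumes the two hard-coded strings below stand in for the `record.description` values that the NCBI lookup would return; `sample_descriptions`, `joined`, and `collected` are hypothetical names used only for illustration, and no network call is made.

```python
# Hypothetical stand-ins for record.description; copied from the question's output.
sample_descriptions = [
    "hypothetical protein UR61_C0009G0014 [candidate division WS6 bacterium GW2011_GWE1_34_7]",
    "ATPase [candidate division WS6 bacterium GW2011_GWE2_33_157]",
]

joined = ""      # what the question's code builds
collected = []   # what the answer suggests building

for text in sample_descriptions:
    joined += text          # string concatenation: everything runs together
    collected.append(text)  # list append: one element per record

print(type(joined).__name__)     # a single str
print(type(collected).__name__)  # a list with one entry per description
print(len(collected))
```

The only change needed in the original loop is the initial value (`[]` instead of `''`) and `append` instead of `+=`.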
  

"with line breaks"

By default a list is not printed that way; use pprint:

import pprint

# your original code here

pprint.pprint(description)
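
As a self-contained sketch of the printing step, the example below uses the four descriptions from the question as hard-coded data (an assumption, so it runs without contacting NCBI). The `width=120` argument is a choice made here, not part of the original answer: with the default width of 80, pprint may also wrap these long strings across several lines. Joining with `"\n"` is shown as a plain alternative when only the text, one description per line, is wanted.

```python
import pprint

# Hard-coded copy of the descriptions shown in the question (no network access).
description = [
    "hypothetical protein UR61_C0009G0014 [candidate division WS6 bacterium GW2011_GWE1_34_7]",
    "ATPase [candidate division WS6 bacterium GW2011_GWE2_33_157]",
    "hypothetical protein UR96_C0034G0007 [candidate division WS6 bacterium GW2011_GWC1_36_11]",
    "phosphoenolpyruvate synthase [Candidatus Komeilibacteria bacterium RIFOXYC1_FULL_37_11]",
]

# pformat returns the same text that pprint.pprint would print.
# width=120 keeps each string on its own line without splitting it.
formatted = pprint.pformat(description, width=120)
print(formatted)

# Alternative: drop the list syntax and print one description per line.
one_per_line = "\n".join(description)
print(one_per_line)
```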