Concatenating non-duplicate rows into a single line - python

Time: 2018-06-07 19:31:09

Tags: python database string function

I'm looking for some tips on a small but confusing piece of data:

P2  Chain   161771  642 ID=0000025456
P2  Chain   161771  642 ID=0000438090
P2  Chain   161771  642 ID=0000438071
P2  Chain   161771  642 ID=00438072
P2  Chain   161771  642 ID=011423689
P2  Chain   161771  642 ID=002655525

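In this data, only the last column differs from row to row; every other column repeats. I would like a script/function that simplifies the data so that those values are joined onto a single line, like this: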

P2  Chain   161771  642 ID=0000025456, 0000438071,0000438090, 002655525, 011423689, 00438072

1 answer:

Answer 0 (score: 0)

text = '''P2  Chain   161771  642 ID=0000025456
P2  Chain   161771  642 ID=0000438090
P2  Chain   161771  642 ID=0000438071
P2  Chain   161771  642 ID=00438072
P2  Chain   161771  642 ID=011423689
P2  Chain   161771  642 ID=002655525'''

ids = []  # We will store the IDs here
for line in text.splitlines():  # break the text block into lines and iterate over them
    split_line = line.split('=')  # break the line into two pieces, before and after the '='
    id_value = split_line[1]  # the part after '=', i.e. the ID (not named `id`, to avoid shadowing the built-in)
    ids.append(id_value)

print('P2 Chain 161771 642 ID=' + str(ids))

Output:

P2 Chain 161771 642 ID=['0000025456', '0000438090', '0000438071', '00438072', '011423689', '002655525']
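Note that this prints the IDs as a Python list, with a hard-coded prefix, rather than in the comma-separated form the question asks for. A minimal sketch of one way to get closer to that form is shown here. It assumes that every data line contains exactly one '=' and that lines should be grouped by everything before the '='. The function name merge_ids is made up for illustration and is not part of the original answer.

```python
from collections import defaultdict


def merge_ids(text):
    """Group lines by the part before '=' and join their IDs on one line.

    Assumption: each data line looks like '<prefix>=<id>'.
    Lines without '=' and blank lines are skipped.
    """
    groups = defaultdict(list)  # prefix -> IDs, in first-seen order
    for line in text.splitlines():
        line = line.strip()
        prefix, sep, id_value = line.partition('=')
        if not sep:
            continue  # no '=' on this line (or the line is blank)
        if id_value not in groups[prefix]:  # keep each ID only once
            groups[prefix].append(id_value)
    return [prefix + '=' + ', '.join(ids) for prefix, ids in groups.items()]


text = '''P2  Chain   161771  642 ID=0000025456
P2  Chain   161771  642 ID=0000438090
P2  Chain   161771  642 ID=0000438071
P2  Chain   161771  642 ID=00438072
P2  Chain   161771  642 ID=011423689
P2  Chain   161771  642 ID=002655525'''

for merged in merge_ids(text):
    print(merged)
```

With the sample data this should print one line whose IDs appear in input order, separated by ", ". Because the prefix is taken from the data instead of being hard-coded, input with several different prefixes would produce one merged line per prefix. To read from a file instead, the same function could be given the result of the file's read() call.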