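Concatenating DataFrame columns using a loop - Python

Time: 2018-05-07 00:46:51

Tags: python dataframe concatenation

I want to concatenate the values of several columns of a DataFrame using a loop.

Here is the actual DataFrame: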

 Artist_1                Artist_2   Artist_3
Lady Antebellum              ?         ?
Reba McEntire                ?         ?
Wanda Jackson                ?         ?
Carrie Underwood             ?         ?
       ?                     ?         ?
The Bellamy Brothers         ?         ?
Keith Urban          Miranda Lambert   ?
Sam Hunt                     ?         ?
Johnny Cash                  ?         ?
Johnny Cash            June Carter     ?
Highwaymen                   ?         ?
Loretta Lynn                 ?         ?
Sissy Spacek                 ?         ?
Loretta Lynn         Sheryl Crow    Miranda Lambert
Charley Pride                ?         ?

And the expected result:

Artist
Lady Antebellum
Reba McEntire
Wanda Jackson
Carrie Underwood
?
The Bellamy Brothers
Keith Urban, Miranda Lambert
Sam Hunt
Johnny Cash
Johnny Cash, June Carter
Highwaymen
Loretta Lynn
Sissy Spacek
Loretta Lynn, Sheryl Crow, Miranda Lambert
Charley Pride

1 answer:

Answer 0: (score: 0)

Here is one way, using pd.DataFrame.apply with str.join, followed by pd.Series.replace to handle rows where no names are present:

import pandas as pd

df = pd.DataFrame({'Artist_1': ['A', 'B', '?', 'D', '?', 'E'],
                   'Artist_2': ['?', '?', '?', 'G', '?', 'I'],
                   'Artist_3': ['J', '?', '?', '?', 'M', 'N']})

# Join the non-'?' values of each row; a row with no names yields '', which is mapped back to '?'
df['Artist_All'] = df.apply(lambda x: ', '.join([i for i in x if i != '?']), axis=1)\
                     .replace('', '?')

print(df)

  Artist_1 Artist_2 Artist_3 Artist_All
0        A        ?        J       A, J
1        B        ?        ?          B
2        ?        ?        ?          ?
3        D        G        ?       D, G
4        ?        ?        M          M
5        E        I        N    E, I, N
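
To connect this back to the question, here is a minimal sketch of the same idea applied to a few rows copied from the question's table. The helper name join_artists, its missing parameter, and the artist_cols list are illustrative names, not part of the original answer; selecting the columns explicitly is an assumption that the frame may hold other columns as well.

```python
import pandas as pd

# A few rows copied from the question's table; '?' marks a missing artist.
df = pd.DataFrame({
    'Artist_1': ['Lady Antebellum', '?', 'Keith Urban', 'Loretta Lynn'],
    'Artist_2': ['?', '?', 'Miranda Lambert', 'Sheryl Crow'],
    'Artist_3': ['?', '?', '?', 'Miranda Lambert'],
})

# Hypothetical list of the columns to combine.
artist_cols = ['Artist_1', 'Artist_2', 'Artist_3']

def join_artists(row, missing='?'):
    """Join the names in a row, skipping the missing-value marker."""
    names = [name for name in row if name != missing]
    # Keep the marker when the row holds no names at all.
    return ', '.join(names) if names else missing

df['Artist'] = df[artist_cols].apply(join_artists, axis=1)
print(df['Artist'].tolist())
```

Returning the marker directly from the helper makes the separate replace('', '?') step unnecessary.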

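Alternatively, you can use a list comprehension:

# Restrict to the original artist columns so an existing Artist_All column is not joined in
cols = ['Artist_1', 'Artist_2', 'Artist_3']
df['Artist_All'] = [', '.join([i for i in x if i != '?']) for x in df[cols].values]
df['Artist_All'] = df['Artist_All'].replace('', '?')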
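
Since the question asks specifically for a loop, the same logic can also be written as an explicit for loop. This is only a sketch under the same assumptions as the answer's sample data; the names artist_cols and combined are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Artist_1': ['A', 'B', '?', 'D', '?', 'E'],
                   'Artist_2': ['?', '?', '?', 'G', '?', 'I'],
                   'Artist_3': ['J', '?', '?', '?', 'M', 'N']})

# Hypothetical list of the columns to combine.
artist_cols = ['Artist_1', 'Artist_2', 'Artist_3']

combined = []
# itertuples(index=False) yields one plain tuple of cell values per row.
for row in df[artist_cols].itertuples(index=False):
    names = [name for name in row if name != '?']
    # Fall back to '?' when every column in the row is missing.
    combined.append(', '.join(names) if names else '?')

df['Artist_All'] = combined
print(df)
```

The loop builds a plain Python list first and assigns it once at the end, which avoids writing to the DataFrame row by row.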