根据组合组合列

时间:2015-10-07 19:33:51

标签: python pandas

我有一个这样的数据框:

         POLY_KEY_I     SP1     SP2
0   FS01080100SM001  POAPRA  TOXRYD
1   FS01080100SM001     NaN     NaN
2   FS01080100SM001   OXRYD  SYMOCC
3   FS01080100SM001  EUPESU  POAPRA
4   FS01080100SM001  BOUGRA  KOEPYR
5   FS01080100SM002  POAPRA  EUPESU
6   FS01080100SM002  POAPRA     NaN
7   FS01080100SM002  POAPRA  KOEPYR

我希望通过POLY_KEY_I进行分组,然后根据此结合SP1SP2

我想要的输出类似于:

         POLY_KEY_I      SP
0   FS01080100SM001  POAPRA
1   FS01080100SM001  TOXRYD
2   FS01080100SM001     NaN
3   FS01080100SM001     NaN
4   FS01080100SM001   OXRYD
5   FS01080100SM001  SYMOCC
6   FS01080100SM001  EUPESU
7   FS01080100SM001  POAPRA
8   FS01080100SM001  BOUGRA
9   FS01080100SM001  KOEPYR 
10  FS01080100SM002  POAPRA
11  FS01080100SM002  EUPESU
12  FS01080100SM002  POAPRA
13  FS01080100SM002     NaN
14  FS01080100SM002  POAPRA
15  FS01080100SM002  KOEPYR

1 个答案:

答案 0 :(得分:2)

您可以使用melt从宽到长重塑,如下所示:

In [10]: pd.melt(df, id_vars='POLY_KEY_I', value_name='SP')
Out[10]: 
         POLY_KEY_I variable      SP
0   FS01080100SM001      SP1  POAPRA
1   FS01080100SM001      SP1     NaN
2   FS01080100SM001      SP1   OXRYD
3   FS01080100SM001      SP1  EUPESU
4   FS01080100SM001      SP1  BOUGRA
5   FS01080100SM002      SP1  POAPRA
6   FS01080100SM002      SP1  POAPRA
7   FS01080100SM002      SP1  POAPRA
8   FS01080100SM001      SP2  TOXRYD
9   FS01080100SM001      SP2     NaN
10  FS01080100SM001      SP2  SYMOCC
11  FS01080100SM001      SP2  POAPRA
12  FS01080100SM001      SP2  KOEPYR
13  FS01080100SM002      SP2  EUPESU
14  FS01080100SM002      SP2     NaN
15  FS01080100SM002      SP2  KOEPYR