How to merge rows and convert them into columns

Time: 2016-11-20 22:43:14

Tags: python pandas

My DataFrame looks like this:

ID  START   END  SEQ
1   11      12   1
1   14      15   3 
1   13      14   2
2   10      14   1
3   11      15   1
3   16      17   2

I need to convert it into this DataFrame:

ID  START_1  END_1  SEQ_1  START_2  END_2  SEQ_2 START_3  END_3  SEQ_3
1   11       12     1      13       14     2     14       15     3 
2   10       14     1      NA       NA     NA    NA       NA     NA   
3   11       15     1      16       17     2     NA       NA     NA 

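The problem is that the number of rows sharing the same ID is not known in advance, so the maximum number of START_X, END_X, SEQ_X columns should not be defined by hand. Is there an automatic way to do this conversion, given that the columns should be ordered by SEQ? Should I use group_by, or some other method?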

1 answer:

Answer 0: (score: 1)

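You can use groupby with unstack, then sort_index on the column MultiIndex, and finally flatten that MultiIndex with a list comprehension:

    # copy SEQ so it can serve as a grouping key and still remain a value column
    df['SEQ1'] = df.SEQ
    df = df.groupby(['ID','SEQ1']).mean().unstack()
    # bring the columns that belong to the same SEQ value next to each other
    df = df.sort_index(axis=1, level=1)
    # flatten (name, seq) pairs into names such as START_1
    df.columns = ['_'.join((col[0], str(col[1]))) for col in df.columns]
    print (df)
        START_1  END_1  SEQ_1  START_2  END_2  SEQ_2  START_3  END_3  SEQ_3
    ID
    1      11.0   12.0    1.0     13.0   14.0    2.0     14.0   15.0    3.0
    2      10.0   14.0    1.0      NaN    NaN    NaN      NaN    NaN    NaN
    3      11.0   15.0    1.0     16.0   17.0    2.0      NaN    NaN    NaN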
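For a reproducible run, the sample frame from the question can be built by hand. The sketch below wraps the steps above in a helper and orders the columns explicitly instead of relying on sort_index, on the assumption that the order within each SEQ group may differ between pandas versions. The helper name `widen_by_seq` and the variable names are made up for illustration; they are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID":    [1, 1, 1, 2, 3, 3],
    "START": [11, 14, 13, 10, 11, 16],
    "END":   [12, 15, 14, 14, 15, 17],
    "SEQ":   [1, 3, 2, 1, 1, 2],
})


def widen_by_seq(frame):
    """Hypothetical helper: one row per ID, one START/END/SEQ triple per SEQ value."""
    value_cols = [c for c in frame.columns if c != "ID"]
    tmp = frame.copy()
    # Duplicate SEQ so it is both a grouping key and a value column.
    tmp["SEQ1"] = tmp["SEQ"]
    wide = tmp.groupby(["ID", "SEQ1"])[value_cols].mean().unstack()
    # Order columns explicitly: by SEQ value first, then by original column order.
    seqs = sorted(wide.columns.get_level_values(1).unique())
    wide = wide[[(col, seq) for seq in seqs for col in value_cols]]
    # Flatten the (name, seq) pairs into names such as START_1.
    wide.columns = [f"{col}_{seq}" for col, seq in wide.columns]
    return wide


result = widen_by_seq(df)
print(result)
```

Because the set of SEQ values is read from the data, the number of START_X, END_X, SEQ_X triples grows with the longest group and never has to be specified manually.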

Another solution uses pivot_table, which applies aggfunc='mean' by default:

    df['SEQ1'] = df.SEQ
    df = df.pivot_table(index= ['ID','SEQ1']).unstack()
    df = df.sort_index(axis=1, level=1)
    df.columns = ['_'.join((col[0], str(col[1]))) for col in df.columns]
    print (df)
        END_1  SEQ_1  START_1  END_2  SEQ_2  START_2  END_3  SEQ_3  START_3
    ID
    1    12.0    1.0     11.0   14.0    2.0     13.0   15.0    3.0     14.0
    2    14.0    1.0     10.0    NaN    NaN      NaN    NaN    NaN      NaN
    3    15.0    1.0     11.0   17.0    2.0     16.0    NaN    NaN      NaN
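Both versions aggregate with the mean, which turns the values into floats and would silently average rows if an (ID, SEQ) pair occurred more than once. If each pair is known to be unique, a variant without aggregation is possible. This is a minimal sketch, not part of the original answer: it assumes a pandas version that provides the nullable "Int64" dtype, and the variable names are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID":    [1, 1, 1, 2, 3, 3],
    "START": [11, 14, 13, 10, 11, 16],
    "END":   [12, 15, 14, 14, 15, 17],
    "SEQ":   [1, 3, 2, 1, 1, 2],
})

value_cols = ["START", "END", "SEQ"]

wide = (
    df.assign(SEQ1=df["SEQ"])
      # verify_integrity=True raises if an (ID, SEQ1) pair is duplicated
      .set_index(["ID", "SEQ1"], verify_integrity=True)[value_cols]
      .unstack()
)

# Order columns by SEQ value first, then START, END, SEQ within each group.
seqs = sorted(wide.columns.get_level_values("SEQ1").unique())
wide = wide[[(col, seq) for seq in seqs for col in value_cols]]
wide.columns = [f"{col}_{seq}" for col, seq in wide.columns]

# Missing combinations become <NA> while the remaining values stay integers.
wide = wide.astype("Int64")
print(wide)
```

The explicit check on the index makes duplicate (ID, SEQ) pairs fail loudly instead of being averaged away.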
