Pandas: converting rows to columns

Time: 2018-05-22 22:44:06

Tags: python pandas dataframe pandas-groupby

I have a CSV that loads into a DataFrame in the following format:

--------------------------------------------------------------
|Date       | Fund | TradeGroup | LongShort | Alpha | Details|
--------------------------------------------------------------
|2018-05-22 |A     | TGG-A      | Long      | 3.99  | Misc   |
|2018-05-22 |A     | TGG-B      | Long      | 4.99  | Misc   |
|2018-05-22 |B     | TGG-A      | Long      | 5.99  | Misc   |
|2018-05-22 |B     | TGG-B      | Short     | 6.99  | Misc   |
|2018-05-22 |C     | TGG-A      | Long      | 1.99  | Misc   |
|2018-05-22 |C     | TGG-B      | Long      | 5.29  | Misc   |
--------------------------------------------------------------

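What I want to do is group the rows by TradeGroup and turn the Fund values into columns, with Alpha as the cell values. The final DataFrame should look like this: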

  --------------------------------------------------------
  |TradeGroup| Date      | A         | B         | C     |
  --------------------------------------------------------
  | TGG-A    |2018-05-22 | 3.99      | 5.99      | 1.99  |
  | TGG-B    |2018-05-22 | 4.99      | 6.99      | 5.29  | 
  --------------------------------------------------------

Also, I don't care about the LongShort and Details columns, so it's fine if they are dropped. Thanks! I tried df.pivot(), but it did not give the format I need.

2 Answers:

Answer 0: (score: 1)

It looks like you are trying to unstack one level of a MultiIndex.

Try this:

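import io

import pandas as pd

data = '''\
Date        Fund  TradeGroup  LongShort  Alpha  Details
2018-05-22 A      TGG-A       Long       3.99   Misc   
2018-05-22 A      TGG-B       Long       4.99   Misc   
2018-05-22 B      TGG-A       Long       5.99   Misc   
2018-05-22 B      TGG-B       Short      6.99   Misc   
2018-05-22 C      TGG-A       Long       1.99   Misc   
2018-05-22 C      TGG-B       Long       5.29   Misc'''

# Wrap the text in a file-like object (pd.compat.StringIO is not available in current pandas)
fileobj = io.StringIO(data)

# Raw string so the whitespace regex is not treated as a string escape
df = pd.read_csv(fileobj, sep=r'\s+')

# Move the key columns into a MultiIndex, unstack the last level (Fund)
# into columns, then keep only the Alpha values
dfout = df.set_index(['TradeGroup','Date','Fund']).unstack()['Alpha']
print(dfout)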

Output:

Fund                      A     B     C
TradeGroup Date                        
TGG-A      2018-05-22  3.99  5.99  1.99
TGG-B      2018-05-22  4.99  6.99  5.29

If you like, you can also apply .reset_index(), which gives:

Fund TradeGroup        Date     A     B     C
0         TGG-A  2018-05-22  3.99  5.99  1.99
1         TGG-B  2018-05-22  4.99  6.99  5.29
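
The reset result above still carries `Fund` as the name of the column axis, which is why it appears in the top-left corner of the header. Here is a minimal sketch, assuming the same sample data as the question, that clears that label so the header matches the requested layout. The variable name `wide` is only illustrative.

```python
import io

import pandas as pd

data = """\
Date Fund TradeGroup LongShort Alpha Details
2018-05-22 A TGG-A Long 3.99 Misc
2018-05-22 A TGG-B Long 4.99 Misc
2018-05-22 B TGG-A Long 5.99 Misc
2018-05-22 B TGG-B Short 6.99 Misc
2018-05-22 C TGG-A Long 1.99 Misc
2018-05-22 C TGG-B Long 5.29 Misc
"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+")

# Select Alpha first so LongShort and Details are dropped before reshaping,
# then unstack the Fund level into columns and flatten the index.
wide = (
    df.set_index(["TradeGroup", "Date", "Fund"])["Alpha"]
    .unstack("Fund")
    .reset_index()
)

# Remove the leftover "Fund" label from the column axis.
wide.columns.name = None

print(wide)
```

Selecting `Alpha` before calling `unstack` reshapes only the one column that is needed, rather than all three value columns.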

Answer 1: (score: 0)

Use pd.pivot_table:

res = df.pivot_table(index=['Date', 'TradeGroup'], columns='Fund',
                     values='Alpha', aggfunc='first').reset_index()

print(res)

Fund        Date TradeGroup     A     B     C
0     2018-05-22      TGG-A  3.99  5.99  1.99
1     2018-05-22      TGG-B  4.99  6.99  5.29
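
The `aggfunc` argument only matters when the same Date/TradeGroup/Fund combination occurs more than once. The sketch below uses a small made-up frame (not the question's data) with one deliberately duplicated combination. It shows that `df.pivot` refuses to reshape such data, while `pivot_table` resolves the duplicate with whatever aggregation is chosen.

```python
import pandas as pd

# Hypothetical data: fund A appears twice for the same Date and TradeGroup.
df = pd.DataFrame(
    {
        "Date": ["2018-05-22", "2018-05-22", "2018-05-22"],
        "TradeGroup": ["TGG-A", "TGG-A", "TGG-A"],
        "Fund": ["A", "A", "B"],
        "Alpha": [3.99, 4.50, 5.99],
    }
)

# pivot() cannot place two values in one cell, so it raises ValueError.
try:
    df.pivot(index="TradeGroup", columns="Fund", values="Alpha")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table() aggregates duplicates: 'first' keeps the first value seen,
# 'mean' averages them.
first = df.pivot_table(index=["Date", "TradeGroup"], columns="Fund",
                       values="Alpha", aggfunc="first")
mean = df.pivot_table(index=["Date", "TradeGroup"], columns="Fund",
                      values="Alpha", aggfunc="mean")

print(pivot_failed)
print(first)
print(mean)
```

With unique combinations, as in the question's data, `'first'` simply returns each value unchanged, so the choice of aggregation has no effect there.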