假设我有以下客户数据表
df = pd.DataFrame.from_dict({"Customer":[0,0,1],
"Date":['01.01.2016', '01.02.2016', '01.01.2016'],
"Type":["First Buy", "Second Buy", "First Buy"],
"Value":[10,20,10]})
看起来像这样:
Customer | Date | Type | Value
-----------------------------------------
0 |01.01.2016|First Buy | 10
-----------------------------------------
0 |01.02.2016|Second Buy| 20
-----------------------------------------
1 |01.01.2016|First Buy | 10
我想通过Type列来旋转表。 但是,旋转仅会将数值值列作为结果。 我想要一个像这样的结构:
Customer | First Buy Date | First Buy Value | Second Buy Date | Second Buy Value
---------------------------------------------------------------------------------
缺少的值是NAN或NAT 这是否可以使用pivot_table。如果没有,我可以想象一些解决方法,但它们非常长。还有其他建议吗?
答案 0 :(得分:6)
使用unstack
:
df1 = df.set_index(['Customer', 'Type']).unstack()
df1.columns = ['_'.join(cols) for cols in df1.columns]
print (df1)
Date_First Buy Date_Second Buy Value_First Buy Value_Second Buy
Customer
0 01.01.2016 01.02.2016 10.0 20.0
1 01.01.2016 None 10.0 NaN
如果需要其他订单的列,请使用swaplevel
和sort_index
:
df1 = df.set_index(['Customer', 'Type']).unstack()
df1.columns = ['_'.join(cols) for cols in df1.columns.swaplevel(0,1)]
df1.sort_index(axis=1, inplace=True)
print (df1)
First Buy_Date First Buy_Value Second Buy_Date Second Buy_Value
Customer
0 01.01.2016 10.0 01.02.2016 20.0
1 01.01.2016 10.0 None NaN