给出数据帧df:
df = pd.DataFrame({'Store_ID': [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1],
'Week_ID': [1,1,1,1,1,1,1, 2,2,2,2,2,2,2, 3,3,3,3,3,3,3],
'Day': ['Mo','Tu','We','Th','Fr','Sa','Su','Mo','Tu','We','Th','Fr','Sa','Su','Mo','Tu','We','Th','Fr','Sa','Su'],
'Manager': ['Kev','Kev','Nash','Kev','Kev','Nash','Kev','Kev','Nash','Kev','Kev','Nash','Kev','Kev','Nash','Kev','Kev','Nash','Kev','Kev','Nash'],
'Store_Opener': ['Jev','Jev','Oash','Kev','Kev','Nash','Jev','Jev','Oash','Kev','Kev','Nash','Jev','Jev','Oash','Kev','Kev','Nash','Kev','Kev','Nash']
})
我想毫无保留地喜欢df1之类的东西。 (而且,我想知道是否可以反向操作或旋转回df)
df1 = pd.DataFrame({'Store_ID': [1,1,1],
'Week_ID': [1,2,3],
'Day_Mo_Manager':['Kev','Kev','Nash'],
'Day_Tu_Manager':['Kev','Nash','Kev'],
'Day_We_Manager':['?','?','?'],
'Day_Th_Manager':['?','?','?'],
'Day_Fr_Manager':['?','?','?'],
'Day_Sa_Manager':['?','?','?'],
'Day_Su_Manager':['?','?','?'],
'Day_Mo_Store_Opener':['Jev','Jev','Oash'],
'Day_Tu_Store_Opener':['Jev','Oash','Jev'],
'Day_We_Store_Opener':['?','?','?'],
'Day_Th_Store_Opener':['?','?','?'],
'Day_Fr_Store_Opener':['?','?','?'],
'Day_Sa_Store_Opener':['?','?','?'],
'Day_Su_Store_Opener':['?','?','?'],
})
是否有某种方法可以旋转表并取消旋转表,如图所示? 受Partial Pivoting In Pandas SQL Or Spark的启发 我尝试过
df.set_index(['Store_ID','Week_ID'])['Manager']。unstack()
df.pivot_table(index ='Store_ID',columns ='Week_ID',values ='Manager')
但是给出了一些错误。
答案 0 :(得分:5)
您可以尝试以下方法:
df_out = df.set_index(['Store_ID','Week_ID','Day']).unstack(-1)
df_out.columns = [f'Day_{j}_{i}' for i, j in df_out.columns]
df_out
输出:
Day_Fr_Manager Day_Mo_Manager Day_Sa_Manager Day_Su_Manager \
Store_ID Week_ID
1 1 Kev Kev Nash Kev
2 Nash Kev Kev Kev
3 Kev Nash Kev Nash
Day_Th_Manager Day_Tu_Manager Day_We_Manager \
Store_ID Week_ID
1 1 Kev Kev Nash
2 Kev Nash Kev
3 Nash Kev Kev
Day_Fr_Store_Opener Day_Mo_Store_Opener Day_Sa_Store_Opener \
Store_ID Week_ID
1 1 Kev Jev Nash
2 Nash Jev Jev
3 Kev Oash Kev
Day_Su_Store_Opener Day_Th_Store_Opener Day_Tu_Store_Opener \
Store_ID Week_ID
1 1 Jev Kev Jev
2 Jev Kev Oash
3 Nash Nash Kev
Day_We_Store_Opener
Store_ID Week_ID
1 1 Oash
2 Kev
3 Kev
此外,如果您想保留日间订单,请使用pd.Categorical:
df['Day'] = pd.Categorical(df['Day'], df['Day'].unique(), ordered=True)
df_out = df.set_index(['Store_ID','Week_ID','Day']).unstack(-1)
df_out.columns = [f'Day_{j}_{i}' for i, j in df_out.columns]
df_out
输出:
Day_Mo_Manager Day_Tu_Manager Day_We_Manager Day_Th_Manager \
Store_ID Week_ID
1 1 Kev Kev Nash Kev
2 Kev Nash Kev Kev
3 Nash Kev Kev Nash
Day_Fr_Manager Day_Sa_Manager Day_Su_Manager \
Store_ID Week_ID
1 1 Kev Nash Kev
2 Nash Kev Kev
3 Kev Kev Nash
Day_Mo_Store_Opener Day_Tu_Store_Opener Day_We_Store_Opener \
Store_ID Week_ID
1 1 Jev Jev Oash
2 Jev Oash Kev
3 Oash Kev Kev
Day_Th_Store_Opener Day_Fr_Store_Opener Day_Sa_Store_Opener \
Store_ID Week_ID
1 1 Kev Kev Nash
2 Kev Nash Jev
3 Nash Kev Kev
Day_Su_Store_Opener
Store_ID Week_ID
1 1 Jev
2 Jev
3 Nash
并重新调整为原始形状。
#Use str accessor and slicing to strip 'Day_' from columns then split on first '_'.
#Unzip and use from_arrays to re-create MultiIndex.
df_out.columns = pd.MultiIndex.from_arrays((zip(*df_out.columns.str[4:].str.split('_',1))))
#Stack level=0 of MultiIndex column header into the dataframe index
df_out.stack(0).reset_index()
输出:
Store_ID Week_ID level_2 Manager Store_Opener
0 1 1 Fr Kev Kev
1 1 1 Mo Kev Jev
2 1 1 Sa Nash Nash
3 1 1 Su Kev Jev
4 1 1 Th Kev Kev
5 1 1 Tu Kev Jev
6 1 1 We Nash Oash
7 1 2 Fr Nash Nash
8 1 2 Mo Kev Jev
9 1 2 Sa Kev Jev
10 1 2 Su Kev Jev
11 1 2 Th Kev Kev
12 1 2 Tu Nash Oash
13 1 2 We Kev Kev
14 1 3 Fr Kev Kev
15 1 3 Mo Nash Oash
16 1 3 Sa Kev Kev
17 1 3 Su Nash Nash
18 1 3 Th Nash Nash
19 1 3 Tu Kev Kev
20 1 3 We Kev Kev