Question

I have a DataFrame that looks like this:

df = pd.DataFrame([['10/03/2020', 'H1', 'x', 2.5], ['10/03/2020', 'H2', 'x', 3.5], ['10/03/2020', 'H1', 'y', 2], ['10/03/2020', 'H2', 'y', 3]], columns=['Day', 'Hour', 'Var', 'Val'])

          Day Hour Var  Val
0  10/03/2020   H1   x  2.5
1  10/03/2020   H2   x  3.5
2  10/03/2020   H1   y  2.0
3  10/03/2020   H2   y  3.0

I want the result to be:

pd.DataFrame([['10/03/2020', 'x', 2.5, 3.5], ['10/03/2020', 'y', 2, 3]], columns=['Day', 'Var', 'H1', 'H2'])

          Day Var   H1   H2
0  10/03/2020   x  2.5  3.5
1  10/03/2020   y  2.0  3.0

What is the best way to do this in pandas? Apologies if this is a duplicate question. If it is, please feel free to point me to a previously answered one.

Answer 1

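You can use the pivot method to do most of the work, then use reset_index to move "Day" and "Var" out of the index and into their own columns. I use rename_axis because I don't like the name left on the column index and think it can confuse new users: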

pivotted_df = (
    df.pivot(index=["Day", "Var"], columns="Hour", values="Val")
    .rename_axis(columns=None)   # Remove the name of the column index ("Hour"). Visual purposes only
    .reset_index()               # Turn "Day" and "Var" into regular columns instead of the index
)

print(pivotted_df)
          Day Var   H1   H2
0  10/03/2020   x  2.5  3.5
1  10/03/2020   y  2.0  3.0

I encourage you to try commenting out the rename_axis(...) and reset_index() lines one at a time, so you can see each step of the process and what each one does!
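For anyone who would rather see the intermediate results side by side, here is a minimal, self-contained sketch of the same chain split into steps. The names step1, step2, step3 and alt are just illustrative and not part of the answer above. As far as I know, passing a list to index in pivot needs a reasonably recent pandas (1.1 or later, I believe), so the last part shows a set_index/unstack variant that should give the same result on older versions:

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["10/03/2020", "H1", "x", 2.5],
        ["10/03/2020", "H2", "x", 3.5],
        ["10/03/2020", "H1", "y", 2],
        ["10/03/2020", "H2", "y", 3],
    ],
    columns=["Day", "Hour", "Var", "Val"],
)

# Step 1: pivot only. "Day" and "Var" become a MultiIndex on the rows,
# and the column index carries the name "Hour".
step1 = df.pivot(index=["Day", "Var"], columns="Hour", values="Val")

# Step 2: drop the "Hour" name from the column index (cosmetic only).
step2 = step1.rename_axis(columns=None)

# Step 3: move "Day" and "Var" from the index back into regular columns.
step3 = step2.reset_index()

# Alternative (assumption: useful where pivot does not accept a list for index):
# build the index by hand, then unstack the "Hour" level into columns.
alt = (
    df.set_index(["Day", "Var", "Hour"])["Val"]
    .unstack("Hour")
    .rename_axis(columns=None)
    .reset_index()
)

print(step1)
print(step3)
```

Printing step1 and step3 shows the difference between the pivoted frame with its MultiIndex and the final flat frame.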
