Question

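I want to pivot a data frame as shown below.

I have an hourly data frame in long format for a variable V1.

I want to create one column per year for the variable V1.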

 import pandas as pd

 date = ['2015-02-03 21:00:00', '2015-02-03 22:30:00', '2016-02-03 21:00:00', '2016-02-03 22:00:00']
 value_column = [33.24, 500, 34.39, 34.49]

 df = pd.DataFrame({'V1': value_column}, index=pd.to_datetime(date))

 print(df.head())

                    V1 
 index                                     
 2015-02-03 21:00:00  33.24   
 2015-02-03 22:30:00  500   
 2016-02-03 21:00:00  34.39   
 2016-02-03 22:00:00  34.49

Expected result:

                V1_2015  V1_2016
02-03 21:00:00    33.24     34.39
02-03 22:00:00    500       34.49

So far I have tried the following, but it only gets me part of the way there:

df['year'] = df.index.year
df=df.set_index(['year'],append=True)
df=df.unstack(level=1)

                         V1
                         2015    2016
    2015-02-03 21:00:00  33.24   
    2015-02-03 22:30:00  500   
    2016-02-03 21:00:00          34.39   
    2016-02-03 22:00:00          34.49

Basically, I want to align the rows on month, day and hour so that I can compare V1 across years. Any idea how to do this efficiently?

Thanks

Answer 1

Try converting the index to strings, so that you can drop the year and the minutes:

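# Record the year before the datetime index is replaced
df['year'] = df.index.year

# Move the datetime index into a regular column named 'index'
df.reset_index(inplace=True)

# Keep characters 5-12 of 'YYYY-MM-DD HH:MM:SS', i.e. 'MM-DD HH'
df['index'] = df['index'].astype(str).apply(lambda x: x[5:13])

df.set_index('index', inplace=True)
df.pivot(columns='year')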

Output:

              V1       
year        2015   2016
index                  
02-03 21   33.24  34.39
02-03 22  500.00  34.49
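
If you would rather avoid string slicing and get the flat V1_2015 / V1_2016 column names from the question, a variant along these lines might work. It is only a sketch: the helper column names `year` and `hour_key` and the result name `wide` are made up for illustration, and it assumes there is at most one observation per hour per year (otherwise `pivot` raises a ValueError and something like `pivot_table` with an aggregation function would be needed).

```python
import pandas as pd

date = ['2015-02-03 21:00:00', '2015-02-03 22:30:00',
        '2016-02-03 21:00:00', '2016-02-03 22:00:00']
value_column = [33.24, 500, 34.39, 34.49]
df = pd.DataFrame({'V1': value_column}, index=pd.to_datetime(date))

# Build two helper columns (hypothetical names):
#   year     -> the calendar year of each timestamp
#   hour_key -> month-day and hour, with minutes and seconds forced to 00
tmp = df.assign(
    year=df.index.year,
    hour_key=df.index.strftime('%m-%d %H:00:00'),
)

# One row per month-day-hour, one column per year
wide = tmp.pivot(index='hour_key', columns='year', values='V1')

# Flatten the column labels to V1_<year> and drop the index name
wide.columns = [f'V1_{year}' for year in wide.columns]
wide.index.name = None

print(wide)
```

Formatting with `strftime` drops the minutes in the same step that drops the year, so 22:30 in 2015 and 22:00 in 2016 end up on the same row.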
