Pandas - Resample rows based on another df's index

Time: 2020-11-09 10:00:18

Tags: python pandas timestamp pandas-resample

I have a DataFrame that looks like this:

zone            Datetime        Demand
 48  2020-08-02 00:00:00  14292.550740
 48  2020-08-02 01:00:00  14243.490740
 48  2020-08-02 02:00:00   9130.840744
 48  2020-08-02 03:00:00  10483.510740
 48  2020-08-02 04:00:00  10014.970740

I want to resample (sum) the Demand values based on the index of another df, which looks like this:

2020-08-02 03:00:00
2020-08-02 06:00:00
2020-08-02 07:00:00
2020-08-02 10:00:00

What is the best way to handle this?

1 Answer:

Answer 0: (score: 0)

I believe you need merge_asof:

print (df2)
                     a
2020-08-02 03:00:00  1
2020-08-02 06:00:00  2
2020-08-02 07:00:00  3
2020-08-02 10:00:00  4

import pandas as pd

# Both keys must be datetimes (and sorted) for merge_asof
df1['Datetime'] = pd.to_datetime(df1['Datetime'])
df2.index = pd.to_datetime(df2.index)

df = pd.merge_asof(df1, 
                   df2.rename_axis('date2').reset_index(), 
                   left_on='Datetime', 
                   right_on='date2', 
                   direction='forward'
                   )
print (df)
   zone            Datetime        Demand               date2  a
0    48 2020-08-02 00:00:00  14292.550740 2020-08-02 03:00:00  1
1    48 2020-08-02 01:00:00  14243.490740 2020-08-02 03:00:00  1
2    48 2020-08-02 02:00:00   9130.840744 2020-08-02 03:00:00  1
3    48 2020-08-02 03:00:00  10483.510740 2020-08-02 03:00:00  1
4    48 2020-08-02 04:00:00  10014.970740 2020-08-02 06:00:00  2
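
To illustrate what direction='forward' does here, the following is a minimal sketch with made-up values (the names left, right and matched, and the Demand numbers, are illustrative assumptions and not from the question). Each row is paired with the first date2 that is greater than or equal to its Datetime, so an exact match such as 03:00 stays in the 03:00 bucket. Both frames are sorted first, since merge_asof requires sorted keys.

```python
import pandas as pd

# Illustrative data only: three hourly rows and two target timestamps
left = pd.DataFrame({
    "Datetime": pd.to_datetime(
        ["2020-08-02 02:00", "2020-08-02 03:00", "2020-08-02 04:00"]
    ),
    "Demand": [1.0, 2.0, 3.0],
})
right = pd.DataFrame({
    "date2": pd.to_datetime(["2020-08-02 03:00", "2020-08-02 06:00"]),
})

# merge_asof needs both keys sorted; 'forward' picks the next date2 >= Datetime
matched = pd.merge_asof(
    left.sort_values("Datetime"),
    right.sort_values("date2"),
    left_on="Datetime",
    right_on="date2",
    direction="forward",
)
print(matched)
```

Rows at 02:00 and 03:00 both map to 03:00, while 04:00 maps to 06:00.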

Then aggregate with sum, for example grouping by both columns if you need both:

df = df.groupby(['zone','date2'], as_index=False)['Demand'].sum()
print (df)
   zone               date2        Demand
0    48 2020-08-02 03:00:00  48150.392964
1    48 2020-08-02 06:00:00  10014.970740
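
For convenience, the steps above can be put together as one self-contained script. This is a sketch that rebuilds df1 and df2 from the sample data in the question; the variable names merged and result are my own, and the column a in df2 is just the placeholder used in this answer.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "zone": [48] * 5,
    "Datetime": pd.to_datetime([
        "2020-08-02 00:00:00",
        "2020-08-02 01:00:00",
        "2020-08-02 02:00:00",
        "2020-08-02 03:00:00",
        "2020-08-02 04:00:00",
    ]),
    "Demand": [14292.550740, 14243.490740, 9130.840744,
               10483.510740, 10014.970740],
})

# The second frame, whose index holds the target timestamps
df2 = pd.DataFrame(
    {"a": [1, 2, 3, 4]},
    index=pd.to_datetime([
        "2020-08-02 03:00:00",
        "2020-08-02 06:00:00",
        "2020-08-02 07:00:00",
        "2020-08-02 10:00:00",
    ]),
)

# Attach to each row the next target timestamp at or after its Datetime
merged = pd.merge_asof(
    df1,
    df2.rename_axis("date2").reset_index(),
    left_on="Datetime",
    right_on="date2",
    direction="forward",
)

# Sum Demand per zone and target timestamp
result = merged.groupby(["zone", "date2"], as_index=False)["Demand"].sum()
print(result)
```

Target timestamps with no matching rows (07:00 and 10:00 in this sample) do not appear in the result, because no row of df1 was assigned to them.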