I have a DataFrame that looks like this:
zone Datetime Demand
48 2020-08-02 00:00:00 14292.550740
48 2020-08-02 01:00:00 14243.490740
48 2020-08-02 02:00:00 9130.840744
48 2020-08-02 03:00:00 10483.510740
48 2020-08-02 04:00:00 10014.970740
I want to resample (sum) the Demand values according to the index of another DataFrame, which looks like this:
2020-08-02 03:00:00
2020-08-02 06:00:00
2020-08-02 07:00:00
2020-08-02 10:00:00
What is the best way to do this?
Answer 0 (score: 0)
I believe you need merge_asof:
print (df2)
a
2020-08-02 03:00:00 1
2020-08-02 06:00:00 2
2020-08-02 07:00:00 3
2020-08-02 10:00:00 4
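import pandas as pd

# merge_asof needs datetime keys, sorted in ascending order on both sides
df1['Datetime'] = pd.to_datetime(df1['Datetime'])
df2.index = pd.to_datetime(df2.index)

# Match each row of df1 to the first df2 timestamp at or after its Datetime
df = pd.merge_asof(df1,
                   df2.rename_axis('date2').reset_index(),
                   left_on='Datetime',
                   right_on='date2',
                   direction='forward')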
print (df)
zone Datetime Demand date2 a
0 48 2020-08-02 00:00:00 14292.550740 2020-08-02 03:00:00 1
1 48 2020-08-02 01:00:00 14243.490740 2020-08-02 03:00:00 1
2 48 2020-08-02 02:00:00 9130.840744 2020-08-02 03:00:00 1
3 48 2020-08-02 03:00:00 10483.510740 2020-08-02 03:00:00 1
4 48 2020-08-02 04:00:00 10014.970740 2020-08-02 06:00:00 2
Then aggregate with sum, for example grouping by both columns if you need both in the result:
df = df.groupby(['zone','date2'], as_index=False)['Demand'].sum()
print (df)
zone date2 Demand
0 48 2020-08-02 03:00:00 48150.392964
1 48 2020-08-02 06:00:00 10014.970740
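For anyone who wants to run the two steps above end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand; the name `boundaries` (standing in for `df2`) and the intermediate names `right`, `merged` and `result` are chosen only for this illustration.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "zone": [48, 48, 48, 48, 48],
    "Datetime": pd.to_datetime([
        "2020-08-02 00:00:00",
        "2020-08-02 01:00:00",
        "2020-08-02 02:00:00",
        "2020-08-02 03:00:00",
        "2020-08-02 04:00:00",
    ]),
    "Demand": [14292.550740, 14243.490740, 9130.840744,
               10483.510740, 10014.970740],
})

# Hypothetical stand-in for df2: only its index matters here
boundaries = pd.DataFrame(
    {"a": [1, 2, 3, 4]},
    index=pd.to_datetime([
        "2020-08-02 03:00:00",
        "2020-08-02 06:00:00",
        "2020-08-02 07:00:00",
        "2020-08-02 10:00:00",
    ]),
)

# Turn the index into a regular column so it can be used as the right key
right = boundaries.rename_axis("date2").reset_index()

# Both sides must be sorted on their keys; direction="forward" assigns each
# row to the first boundary that is at or after its Datetime
merged = pd.merge_asof(
    df1.sort_values("Datetime"),
    right.sort_values("date2"),
    left_on="Datetime",
    right_on="date2",
    direction="forward",
)

# Sum Demand per zone and per matched boundary
result = merged.groupby(["zone", "date2"], as_index=False)["Demand"].sum()
print(result)
```

Note that with direction='forward', any row later than the last boundary has no match and gets NaT in date2; groupby then drops those rows by default, so they would not appear in the summed result.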