Resample weekly on Mondays

Asked: 2021-05-05 09:22:50

Tags: python pandas pandas-resample

I want to sum two columns per week, with each week starting on Monday, grouped by id and date.

df = pd.DataFrame({'id':['x2', 'x2', 'x1', 'x1', 'x1'],
            'date':['2021-01-03','2021-01-09', '2021-01-02', '2021-01-01', '2021-01-01'],
       'distance_europe':[100, 100, 200, 200, 100],
                'distance_USA': [0, 200, 100, 100, 3]})

Expected output:

2020-12-28  x2  100    0
2020-12-28  x1  500  203
2021-01-04  x2  100  200

Update: my solution below works fine.

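# resample needs a datetime column, so convert the date strings first
df['date'] = pd.to_datetime(df['date'])
df = df.groupby('id').resample('W', on='date').sum().reset_index()
df['due_date'] = df['date'] - pd.Timedelta('6D')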
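
A minimal sketch of why subtracting six days works here, assuming the default behaviour of 'W': it is anchored on Sunday, and each weekly bin is labeled by the Sunday that closes it, so going back six days lands on the Monday that opens the same week. The toy series and the names offset, labels, and mondays are illustrative only and are not part of the original post.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# 'W' resolves to a weekly offset anchored on Sunday (weekday 6)
offset = to_offset('W')

# Toy series covering the date range used in the question
s = pd.Series(1, index=pd.to_datetime(['2021-01-01', '2021-01-03', '2021-01-09']))

# Default weekly resampling labels each bin with the Sunday that ends it
labels = s.resample('W').sum().index

# Shifting each label back six days gives the Monday that starts the week
mondays = labels - pd.Timedelta('6D')

print(offset.freqstr)
print(list(labels.strftime('%Y-%m-%d')))
print(list(mondays.strftime('%Y-%m-%d')))
```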

1 Answer:

Answer 0 (score: 1)

Use Grouper with closed='left' and label='left', group by id as well, and aggregate with sum:

import pandas as pd

df = pd.DataFrame({'id':['x2', 'x2', 'x1', 'x1', 'x1'],
            'date':['2021-01-03','2021-01-09', '2021-01-02', '2021-01-01', '2021-01-01'],
       'distance_europe':[100, 100, 200, 200, 100],
                'distance_USA': [0, 200, 100, 100, 3]})


df['date'] = pd.to_datetime(df['date'])

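# Weekly bins anchored on Monday: closed='left' and label='left' make each bin
# run from a Monday up to (not including) the next Monday, labeled by its start.
df = (df.groupby([pd.Grouper(key='date', freq='W-MON', closed='left', label='left'), 'id'])
        .sum()
        .reset_index())
print(df)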
        date  id  distance_europe  distance_USA
0 2020-12-28  x1              500           203
1 2020-12-28  x2              100             0
2 2021-01-04  x2              100           200

Your solution also works with the same parameters as above, once the grouping is changed from Sundays (the default anchor for W) to Mondays:

df = df.groupby('id').resample('W-MON', on='date', closed='left', label='left').sum().reset_index()

print(df)
   id       date  distance_europe  distance_USA
0  x1 2020-12-28              500           203
1  x2 2020-12-28              100             0
2  x2 2021-01-04              100           200
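
As a cross-check on the binning, the Monday that starts each week can also be computed by hand and used as an ordinary grouping key. This is only a sketch of an alternative, not the method from the answer above; the names week_start and check are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'id': ['x2', 'x2', 'x1', 'x1', 'x1'],
                   'date': ['2021-01-03', '2021-01-09', '2021-01-02', '2021-01-01', '2021-01-01'],
                   'distance_europe': [100, 100, 200, 200, 100],
                   'distance_USA': [0, 200, 100, 100, 3]})
df['date'] = pd.to_datetime(df['date'])

# dt.weekday is 0 for Monday and 6 for Sunday, so subtracting that many
# days moves every date back to the Monday of its own week
week_start = df['date'] - pd.to_timedelta(df['date'].dt.weekday, unit='D')

# Group by the computed Monday and the id, then sum the two distance columns
check = (df.assign(week_start=week_start)
           .groupby(['week_start', 'id'])[['distance_europe', 'distance_USA']]
           .sum()
           .reset_index())
print(check)
```

If the two approaches agree on the sums per Monday and id, the closed='left' and label='left' settings are doing what was intended.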