I want to sum two columns by ID and by week, where each week starts on Monday and is aggregated independently.
df = pd.DataFrame({'id':['x2', 'x2', 'x1', 'x1', 'x1'],
'date':['2021-01-03','2021-01-09', '2021-01-02', '2021-01-01', '2021-01-01'],
'distance_europe':[100, 100, 200, 200, 100],
'distance_USA': [0, 200, 100, 100, 3]})
Expected output:
2020-12-28 x2 100 0
2020-12-28 x1 500 203
2021-01-04 x2 100 200
Update: my solution below works well.
df = df.groupby('id').resample('W', on='date').sum().reset_index()
df['due_date'] = df.date - pd.Timedelta('6D')
Answer 0 (score: 1)
Use Grouper with freq='W-Mon', closed='left' and label='left', group by it together with id, and aggregate with sum:
import pandas as pd

df = pd.DataFrame({'id':['x2', 'x2', 'x1', 'x1', 'x1'],
'date':['2021-01-03','2021-01-09', '2021-01-02', '2021-01-01', '2021-01-01'],
'distance_europe':[100, 100, 200, 200, 100],
'distance_USA': [0, 200, 100, 100, 3]})
df['date'] = pd.to_datetime(df['date'])
df = (df.groupby([pd.Grouper(key='date', freq='W-Mon', closed='left', label='left'), 'id'])
.sum()
.reset_index())
print (df)
date id distance_europe distance_USA
0 2020-12-28 x1 500 203
1 2020-12-28 x2 100 0
2 2021-01-04 x2 100 200
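As a cross-check on the Monday-based weeks, here is a minimal sketch of an alternative that is not part of the answer above: it assumes that converting each date to a weekly period ending on Sunday ('W-SUN') and taking the period start gives the Monday of that week. The names week_start and weekly are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'id': ['x2', 'x2', 'x1', 'x1', 'x1'],
                   'date': ['2021-01-03', '2021-01-09', '2021-01-02', '2021-01-01', '2021-01-01'],
                   'distance_europe': [100, 100, 200, 200, 100],
                   'distance_USA': [0, 200, 100, 100, 3]})
df['date'] = pd.to_datetime(df['date'])

# A weekly period that ends on Sunday starts on the preceding Monday.
week_start = df['date'].dt.to_period('W-SUN').dt.start_time

# Group by the computed Monday and by id, then sum the two distance columns.
weekly = (df.assign(week_start=week_start)
            .groupby(['week_start', 'id'])[['distance_europe', 'distance_USA']]
            .sum()
            .reset_index())
print(weekly)
```

On the sample data this should give the same three rows as the Grouper version, with the week label held in week_start instead of date.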
Your solution can be changed to group by Mondays instead of Sundays (W defaults to Sundays) by using the same parameters as above:
df = df.groupby('id').resample('W-Mon', on='date', closed='left', label='left').sum().reset_index()
print (df)
id date distance_europe distance_USA
0 x1 2020-12-28 500 203
1 x2 2020-12-28 100 0
2 x2 2021-01-04 100 200
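To see how the two labelings relate, the following sketch uses a small made-up Series (not the data from the question) and compares the default 'W' resample, which labels each week by its closing Sunday, with 'W-MON' using closed='left' and label='left'. Subtracting six days from the default labels, as in the question's update, should land on the same Mondays.

```python
import pandas as pd

# Toy data: a Friday, a Sunday and a Monday.
s = pd.Series([1, 1, 1],
              index=pd.to_datetime(['2021-01-01', '2021-01-03', '2021-01-04']))

# Default 'W' is 'W-SUN': bins are closed on the right and labeled by the Sunday.
default_weekly = s.resample('W').sum()

# Monday-anchored bins, closed on the left and labeled by the starting Monday.
monday_weekly = s.resample('W-MON', closed='left', label='left').sum()

# Shifting the Sunday labels back by six days gives the Monday of the same week.
shifted_labels = default_weekly.index - pd.Timedelta('6D')

print(default_weekly)
print(monday_weekly)
print(shifted_labels)
```

Both variants put the Friday and the Sunday in the week of 2020-12-28 and the Monday in the week of 2021-01-04; only the label differs.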