I'm working with a weekly DataFrame and want to summarize daily data into it.
I have two DataFrames:
df:
{'start_date': {0: Timestamp('2018-11-05 00:00:00'),
1: Timestamp('2018-11-12 00:00:00'),
2: Timestamp('2018-11-19 00:00:00'),
3: Timestamp('2018-11-26 00:00:00'),
4: Timestamp('2018-12-03 00:00:00'),
5: Timestamp('2018-12-10 00:00:00'),
6: Timestamp('2018-12-17 00:00:00'),
7: Timestamp('2018-12-24 00:00:00'),
8: Timestamp('2018-12-31 00:00:00'),
9: Timestamp('2019-01-07 00:00:00'),
10: Timestamp('2019-01-14 00:00:00'),
11: Timestamp('2019-01-21 00:00:00'),
12: Timestamp('2019-01-28 00:00:00')},
'woy': {0: 45,
1: 46,
2: 47,
3: 48,
4: 49,
5: 50,
6: 51,
7: 52,
8: 1,
9: 2,
10: 3,
11: 4,
12: 5}}
df_school_vac:
{'timestamp_area_A': {0: Timestamp('2018-12-22 00:00:00'),
1: Timestamp('2018-12-23 00:00:00'),
2: Timestamp('2018-12-24 00:00:00'),
3: Timestamp('2018-12-25 00:00:00'),
4: Timestamp('2018-12-26 00:00:00'),
5: Timestamp('2018-12-27 00:00:00'),
6: Timestamp('2018-12-28 00:00:00'),
7: Timestamp('2018-12-29 00:00:00'),
8: Timestamp('2018-12-30 00:00:00'),
9: Timestamp('2018-12-31 00:00:00'),
10: Timestamp('2019-01-01 00:00:00'),
11: Timestamp('2019-01-02 00:00:00'),
12: Timestamp('2019-01-03 00:00:00'),
13: Timestamp('2019-01-04 00:00:00'),
14: Timestamp('2019-01-05 00:00:00'),
15: Timestamp('2019-01-06 00:00:00')},
'vacation_name': {0: 'Vacances de Noël',
1: 'Vacances de Noël',
2: 'Vacances de Noël',
3: 'Vacances de Noël',
4: 'Vacances de Noël',
5: 'Vacances de Noël',
6: 'Vacances de Noël',
7: 'Vacances de Noël',
8: 'Vacances de Noël',
9: 'Vacances de Noël',
10: 'Vacances de Noël',
11: 'Vacances de Noël',
12: 'Vacances de Noël',
13: 'Vacances de Noël',
14: 'Vacances de Noël',
15: 'Vacances de Noël'},
'woy': {0: 51,
1: 51,
2: 52,
3: 52,
4: 52,
5: 52,
6: 52,
7: 52,
8: 52,
9: 1,
10: 1,
11: 1,
12: 1,
13: 1,
14: 1,
15: 1}}
Answer 0 (score: 1)
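Consider aggregating df_school_vac with pd.Grouper() to get weekly counts anchored on Mondays, then left-join the result onto the week-level df with merge. Note that freq='W-MON' defines weeks that end on Monday and labels each bin with that closing Monday, so the count attached to a given start_date covers the preceding Tuesday through that Monday: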
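import pandas as pd

# Count vacation days per vacation name and per weekly bin (W-MON: bins end on Monday).
agg_df = (df_school_vac.groupby(['vacation_name',
                                 pd.Grouper(key='timestamp_area_A', freq='W-MON')])
                       .count()
                       .reset_index()
                       # Rename the three resulting columns to match the weekly frame.
                       .set_axis(['holiday_school_name', 'start_date', 'holiday_school_count'],
                                 axis='columns')
         )

# The left join keeps every week of df; weeks without a match get NaN and a False flag.
final_df = (pd.merge(df, agg_df, how='left', on=['start_date'])
              .assign(holiday_school=lambda x: x['holiday_school_name'].notna())
           )

print(final_df)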
#     start_date  woy  holiday_school_name  holiday_school_count  holiday_school
#  0  2018-11-05   45                  NaN                   NaN           False
#  1  2018-11-12   46                  NaN                   NaN           False
#  2  2018-11-19   47                  NaN                   NaN           False
#  3  2018-11-26   48                  NaN                   NaN           False
#  4  2018-12-03   49                  NaN                   NaN           False
#  5  2018-12-10   50                  NaN                   NaN           False
#  6  2018-12-17   51                  NaN                   NaN           False
#  7  2018-12-24   52     Vacances de Noël                   3.0            True
#  8  2018-12-31    1     Vacances de Noël                   7.0            True
#  9  2019-01-07    2     Vacances de Noël                   6.0            True
# 10  2019-01-14    3                  NaN                   NaN           False
# 11  2019-01-21    4                  NaN                   NaN           False
# 12  2019-01-28    5                  NaN                   NaN           False
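
One thing worth checking: with freq='W-MON', the counts 3, 7 and 6 above are attached to the Monday that closes each bin, which does not line up with the woy column in df_school_vac (2 rows in week 51, 7 in week 52, 7 in week 1). If the intent is to count vacation days per week beginning on start_date, a minimal sketch, assuming that reading of the question, is to map each day back to its own Monday and group on that. The names vac, week_start and weekly are mine, and the two frames are rebuilt compactly from the sample data so the snippet runs on its own:

```python
import pandas as pd

# Rebuild the sample frames from the question in compact form (same dates and values).
df = pd.DataFrame({
    "start_date": pd.date_range("2018-11-05", periods=13, freq="7D"),
    "woy": list(range(45, 53)) + list(range(1, 6)),
})
vac = pd.DataFrame({
    "timestamp_area_A": pd.date_range("2018-12-22", "2019-01-06", freq="D"),
    "vacation_name": "Vacances de Noël",
})

# Map every day to the Monday that starts its week (dt.weekday: Monday == 0).
week_start = vac["timestamp_area_A"] - pd.to_timedelta(
    vac["timestamp_area_A"].dt.weekday, unit="D"
)

# Count vacation days per (week start, vacation name).
weekly = (
    vac.assign(start_date=week_start)
       .groupby(["start_date", "vacation_name"], as_index=False)
       .agg(holiday_school_count=("timestamp_area_A", "count"))
       .rename(columns={"vacation_name": "holiday_school_name"})
)

# Left join keeps all 13 weeks; unmatched weeks get NaN and a False flag.
final = df.merge(weekly, how="left", on="start_date")
final["holiday_school"] = final["holiday_school_name"].notna()

print(final)
```

On the sample data this gives counts of 2, 7 and 7 for the weeks starting 2018-12-17, 2018-12-24 and 2018-12-31, which agrees with the woy values in df_school_vac. Which alignment is right depends on what "week" should mean in the weekly frame.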