I have a DataFrame, as shown below.
Doctor Appointment Booking_ID
A 2020-01-18 12:00:00 1
A 2020-01-18 12:30:00 2
A 2020-01-18 13:00:00 3
A 2020-01-18 13:00:00 4
B 2020-01-18 12:00:00 5
B 2020-01-18 12:30:00 6
B 2020-01-18 13:00:00 7
B 2020-01-18 13:00:00 8
B 2020-01-18 13:00:00 9
B 2020-01-18 16:30:00 10
A 2020-01-19 12:00:00 11
A 2020-01-19 12:30:00 12
A 2020-01-19 13:00:00 13
A 2020-01-19 13:30:00 14
A 2020-01-19 14:00:00 15
A 2020-01-19 14:00:00 16
A 2020-01-19 14:00:00 17
A 2020-01-19 14:00:00 18
B 2020-01-19 12:00:00 19
B 2020-01-19 12:30:00 20
B 2020-01-19 13:00:00 21
B 2020-01-19 13:30:00 22
B 2020-01-19 14:00:00 23
B 2020-01-19 13:30:00 24
B 2020-01-19 15:00:00 25
B 2020-01-18 15:30:00 26
From the data above, I want to find the number of bookings for the same doctor at the same appointment time.
Expected output:
Doctor Appointment Booking_ID Number_of_Booking
A 2020-01-18 12:00:00 1 1
A 2020-01-18 12:30:00 2 1
A 2020-01-18 13:00:00 3 2
A 2020-01-18 13:00:00 4 2
B 2020-01-18 12:00:00 5 1
B 2020-01-18 12:30:00 6 1
B 2020-01-18 13:00:00 7 3
B 2020-01-18 13:00:00 8 3
B 2020-01-18 13:00:00 9 3
B 2020-01-18 16:30:00 10 1
A 2020-01-19 12:00:00 11 1
A 2020-01-19 12:30:00 12 1
A 2020-01-19 13:00:00 13 1
A 2020-01-19 13:30:00 14 1
A 2020-01-19 14:00:00 15 4
A 2020-01-19 14:00:00 16 4
A 2020-01-19 14:00:00 17 4
A 2020-01-19 14:00:00 18 4
B 2020-01-19 12:00:00 19 1
B 2020-01-19 12:30:00 20 1
B 2020-01-19 13:00:00 21 1
B 2020-01-19 13:30:00 22 2
B 2020-01-19 14:00:00 23 2
B 2020-01-19 13:30:00 24 2
B 2020-01-19 14:00:00 25 2
B 2020-01-18 15:30:00 26 1
Example:
At 2020-01-19 13:30:00, doctor B has two bookings, as shown below:
Doctor Appointment Booking_ID
B 2020-01-19 13:30:00 22
B 2020-01-19 13:30:00 24
So the output would look like this:
Doctor Appointment Booking_ID Number_of_Booking
B 2020-01-19 13:30:00 22 2
B 2020-01-19 13:30:00 24 2
Answer 0 (score: 2)
First, use GroupBy.transform together with GroupBy.size:
df['Number_of_Booking'] = df.groupby(['Doctor', 'Appointment'])['Booking_ID'].transform('size')
print(df.head())
Doctor Appointment Booking_ID Number_of_Booking
0 A 2020-01-18 12:00:00 1 1
1 A 2020-01-18 12:30:00 2 1
2 A 2020-01-18 13:00:00 3 2
3 A 2020-01-18 13:00:00 4 2
4 B 2020-01-18 12:00:00 5 1
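To make the approach above reproducible, here is a minimal self-contained sketch. It assumes the question's data can be rebuilt by hand (only the first five rows are used here) and that Appointment is parsed with pd.to_datetime; neither detail is stated in the original post.

```python
import pandas as pd

# First five rows of the question's data, rebuilt by hand (an assumption
# about how the original DataFrame was constructed).
df = pd.DataFrame({
    "Doctor": ["A", "A", "A", "A", "B"],
    "Appointment": pd.to_datetime([
        "2020-01-18 12:00:00",
        "2020-01-18 12:30:00",
        "2020-01-18 13:00:00",
        "2020-01-18 13:00:00",
        "2020-01-18 12:00:00",
    ]),
    "Booking_ID": [1, 2, 3, 4, 5],
})

# transform('size') computes the size of each (Doctor, Appointment) group
# and broadcasts it back to every row of that group, so the result has
# the same length and index as df.
df["Number_of_Booking"] = (
    df.groupby(["Doctor", "Appointment"])["Booking_ID"].transform("size")
)
print(df)
```

Note that 'size' counts every row in a group, while 'count' would skip rows where Booking_ID is missing, so the two can differ if the column contains NaN values.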
If the whole data set contains only one unique combination of Doctor and Appointment, as in the second sample, simply assign the length of the DataFrame:
df['Number_of_Booking'] = len(df)
print(df)
Doctor Appointment Booking_ID Number_of_Booking
0 B 2020-01-19 13:30:00 22 2
1 B 2020-01-19 13:30:00 24 2
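As a quick check, the sketch below compares the two approaches on the two-row sample. The names sample, by_len and by_group are hypothetical, introduced only for illustration. When there is a single (Doctor, Appointment) combination, both approaches should give the same numbers, so the groupby version can be used without special-casing.

```python
import pandas as pd

# The two-row sample from the question, rebuilt by hand.
sample = pd.DataFrame({
    "Doctor": ["B", "B"],
    "Appointment": pd.to_datetime(["2020-01-19 13:30:00"] * 2),
    "Booking_ID": [22, 24],
})

# Shortcut: valid only when every row shares one Doctor/Appointment pair.
by_len = len(sample)

# General approach: per-group size broadcast back to each row.
by_group = (
    sample.groupby(["Doctor", "Appointment"])["Booking_ID"].transform("size")
)

sample["Number_of_Booking"] = by_group
print(sample)
```

The len(df) shortcut silently gives wrong counts as soon as a second combination appears in the data, which is why the groupby form is the safer default.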