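I am trying to write a for loop that, for each unique ID, calculates the time spent in each SubmissionStatus (e.g. Pending OSPA, Pending Department) and stores the result in a list of dictionaries, one dictionary per unique ID. The time spent is calculated by taking the earliest LastModified value while the status is in a particular stage and subtracting it from the LastModified value at the point the status changes to the next stage. For example, when SubmissionStatus goes from Pending OSPA to Pending Department, I take the LastModified timestamp from that row and subtract the LastModified timestamp from when the SubmissionStatus was first Pending OSPA: 04/05/2018 - 04/01/2018 = 4 days, plus 04/06/2018 - 04/05/2018 = 1 day, so total = 5 days.
The input is a pandas DataFrame: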
ID LastModified SubmissionStatus
0 1 04/01/2018 Pending OSPA
1 1 04/03/2018 Pending OSPA
2 1 04/05/2018 Pending Department
3 1 04/06/2018 Pending OSPA
4 2 04/02/2018 Pending OSPA
5 2 04/03/2018 Pending Department
6 2 04/05/2018 Complete
The output is a list of dictionaries:
[
{ ID : 1,
DaysWithOSPA: 5,
DaysWithDepartment: 1},
{ ID : 2,
DaysWithOSPA: 1,
DaysWithDepartment: 2}]
Answer 0 (score: 1)
df.groupby(['ID', 'SubmissionStatus']).sum()
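Note that a grouped sum only yields durations if there is a numeric column of days to add up, and the frame in the question has only timestamps. Below is a minimal sketch of one way to get there, not necessarily what the asker intended. It assumes each interval between consecutive rows of the same ID is attributed to the status at the start of that interval, and that the last row of each ID contributes nothing because it has no later timestamp. The `Days` column and the `totals` and `result` names are made up for illustration. Under these assumptions ID 1 comes out at 4 days with OSPA rather than the 5 given in the question, because the final Pending OSPA row is open-ended.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2],
    "LastModified": ["04/01/2018", "04/03/2018", "04/05/2018", "04/06/2018",
                     "04/02/2018", "04/03/2018", "04/05/2018"],
    "SubmissionStatus": ["Pending OSPA", "Pending OSPA", "Pending Department",
                         "Pending OSPA", "Pending OSPA", "Pending Department",
                         "Complete"],
})

# Parse the timestamps and order rows chronologically within each ID.
df["LastModified"] = pd.to_datetime(df["LastModified"], format="%m/%d/%Y")
df = df.sort_values(["ID", "LastModified"])

# Days until the next row of the same ID (NaN for the last row of each ID).
# Assumption: this interval belongs to the current row's status.
next_ts = df.groupby("ID")["LastModified"].shift(-1)
df["Days"] = (next_ts - df["LastModified"]).dt.days

# Sum the days per ID and status, then pivot statuses into columns.
totals = (
    df.dropna(subset=["Days"])
      .groupby(["ID", "SubmissionStatus"])["Days"]
      .sum()
      .unstack(fill_value=0)
)

# Build the list of dictionaries in the shape the question asks for.
result = [
    {
        "ID": int(id_),
        "DaysWithOSPA": int(row.get("Pending OSPA", 0)),
        "DaysWithDepartment": int(row.get("Pending Department", 0)),
    }
    for id_, row in totals.iterrows()
]
print(result)
```

If the open-ended final status should be counted up to some cutoff date, one option would be to fill the missing `next_ts` values with that date before subtracting.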