Data
Division Name start_date
A apple 2001-01-05
A banana 2001-03-06
A apple 2001-06-08
A orange 2001-07-09
B peach 2001-01-10
B melon 2001-06-02
B berry 2001-08-19
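I need to create an end date for each row, which is the start date of the next person in the same division. The last person observed in each division has no end date, so I just want to fill in today's date, 2019-04-06.
Goal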
Division Name start_date end_date
A apple 2001-01-05 2001-03-06
A banana 2001-03-06 2001-06-08
A apple 2001-06-08 2001-07-09
A orange 2001-07-09 2019-04-06
B peach 2001-01-10 2001-06-02
B melon 2001-06-02 2001-08-19
B berry 2001-08-19 2019-04-06
I tried
data['end_date'] = data.groupby('Division')['start_date'].index+1
but got this error message:
AttributeError: Cannot access attribute 'index' of 'SeriesGroupBy' objects, try using the 'apply' method
Does anyone know how to solve this?
Thanks a lot!
Answer 0 (score: 1)
df['end_date'] = df.groupby('Division').start_date.shift(-1)
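Then fill the remaining missing values in that column with today's date using fillna(). Restricting the call to end_date avoids overwriting NaNs in other columns:
import datetime
df['end_date'] = df['end_date'].fillna(datetime.date.today())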
Division Name start_date end_date
0 A apple 2001-01-05 2001-03-06
1 A banana 2001-03-06 2001-06-08
2 A apple 2001-06-08 2001-07-09
3 A orange 2001-07-09 2019-04-06
4 B peach 2001-01-10 2001-06-02
5 B melon 2001-06-02 2001-08-19
6 B berry 2001-08-19 2019-04-06
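For reference, the two steps above can be combined into a self-contained sketch. It makes a few assumptions that are not in the original post: the frame is rebuilt by hand from the sample data, start_date is parsed with pd.to_datetime so that both date columns share a datetime dtype, and "today" is pinned to 2019-04-06 (via the illustrative variable name today) so the result is reproducible. It also relies on the rows already being sorted by start_date within each division; if they are not, sort first.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be sorted by
# start_date within each Division).
df = pd.DataFrame({
    "Division": ["A", "A", "A", "A", "B", "B", "B"],
    "Name": ["apple", "banana", "apple", "orange", "peach", "melon", "berry"],
    "start_date": pd.to_datetime([
        "2001-01-05", "2001-03-06", "2001-06-08", "2001-07-09",
        "2001-01-10", "2001-06-02", "2001-08-19",
    ]),
})

# Pinned for reproducibility (assumption); in real use this could be
# pd.Timestamp.today().normalize().
today = pd.Timestamp("2019-04-06")

# shift(-1) pulls the next row's start_date up within each group, leaving
# NaT on the last row of every Division; fillna then replaces those NaT values.
df["end_date"] = (
    df.groupby("Division")["start_date"]
      .shift(-1)
      .fillna(today)
)

print(df)
```

The original attempt failed because a SeriesGroupBy object has no index attribute to offset; shift(-1) performs the "next row within the group" lookup directly.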