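Pandas - subtract the min date from the max date for each group

Time: 2017-12-27 12:06:07

Tags: python pandas group-by

I want to add a column to this table that holds, for each customer_id, the result of subtracting the minimum date from the maximum date (in days).

Input: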

action_date customer_id
 2017-08-15       1
 2017-08-21       1
 2017-08-21       1
 2017-09-02       1
 2017-08-28       2
 2017-09-29       2
 2017-10-15       3   
 2017-10-30       3
 2017-12-05       3

to get this table:

Output:

action_date customer_id    diff
 2017-08-15       1         18
 2017-08-21       1         18
 2017-08-21       1         18
 2017-09-02       1         18
 2017-08-28       2         32
 2017-09-29       2         32
 2017-10-15       3         51
 2017-10-30       3         51
 2017-12-05       3         51

I tried this code, but it fills the column with a lot of NaNs:

group = df.groupby(by='customer_id')
df['diff'] = (group['action_date'].max() - group['action_date'].min()).dt.days

1 Answer:

Answer 0: (score: 8)

You can use the transform method:

In [23]: df['diff'] = df.groupby('customer_id') \
                        ['action_date'] \
                        .transform(lambda x: (x.max()-x.min()).days)

In [24]: df
Out[24]:
  action_date  customer_id  diff
0  2017-08-15            1    18
1  2017-08-21            1    18
2  2017-08-21            1    18
3  2017-09-02            1    18
4  2017-08-28            2    32
5  2017-09-29            2    32
6  2017-10-15            3    51
7  2017-10-30            3    51
8  2017-12-05            3    51
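
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the input table by hand and assumes that action_date may have been loaded as strings, so it converts the column with pd.to_datetime first. The names span and diff_map are only for illustration and do not appear in the question. The sketch also suggests why the original attempt produced NaNs: the max() - min() result is indexed by customer_id (1, 2, 3), while df has the default row index 0..8, so assigning it directly aligns on those labels and only rows 1, 2 and 3 receive a value.

```python
import pandas as pd

# Rebuild the input table from the question.
df = pd.DataFrame({
    "action_date": ["2017-08-15", "2017-08-21", "2017-08-21", "2017-09-02",
                    "2017-08-28", "2017-09-29",
                    "2017-10-15", "2017-10-30", "2017-12-05"],
    "customer_id": [1, 1, 1, 1, 2, 2, 3, 3, 3],
})
# Assumption: the column may be stored as strings, so convert it to datetime.
df["action_date"] = pd.to_datetime(df["action_date"])

dates = df.groupby("customer_id")["action_date"]

# Per-group span in days. This Series has one row per customer_id and is
# indexed by customer_id, not by the row index of df.
span = (dates.max() - dates.min()).dt.days

# Option 1: keep the aggregation and broadcast it back through customer_id.
df["diff_map"] = df["customer_id"].map(span)

# Option 2: transform returns a result aligned with df's own index, so the
# subtraction can be done on whole columns without a Python lambda.
df["diff"] = (dates.transform("max") - dates.transform("min")).dt.days

print(df)
```

Both options should give the same column as the lambda version above. Passing the built-in "max" and "min" names to transform avoids calling a Python function once per group, which may matter on larger frames.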