Get the difference in days for each group

Time: 2018-11-07 23:18:23

Tags: python pandas dataframe

df

      group1      group2     date
1     a           b          2017-01-01
2     b           c          2017-01-05
3     d           b          2017-01-07
4     c           a          2017-01-10
5     a           d          2017-01-15

Expected df

      group1      group2     grp1_diff_days      grp2_diff_days
1     a           b          NaN                 NaN
2     b           c          4                   NaN
3     d           b          NaN                 2
4     c           a          5                   9
5     a           d          5                   8

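For each group value, I want the number of days since that value last appeared, and I want to put the result in the respective column (grp1_diff_days or grp2_diff_days), regardless of whether the previous appearance was in the other group column.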

1 Answer:

Answer 0: (score: 1)

Setup (assuming your date column is already datetime):

import pandas as pd

df = pd.DataFrame({'group1': {1: 'a', 2: 'b', 3: 'd', 4: 'c', 5: 'a'},
 'group2': {1: 'b', 2: 'c', 3: 'b', 4: 'a', 5: 'd'},
 'date': {1: pd.Timestamp('2017-01-01 00:00:00'),
  2: pd.Timestamp('2017-01-05 00:00:00'),
  3: pd.Timestamp('2017-01-07 00:00:00'),
  4: pd.Timestamp('2017-01-10 00:00:00'),
  5: pd.Timestamp('2017-01-15 00:00:00')}})
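
If your date column is still stored as strings rather than datetime, a conversion along these lines may be needed first. This is only a sketch: the `raw` frame and the `%Y-%m-%d` format are assumptions for illustration, not something stated in the question.

```python
import pandas as pd

# Hypothetical raw frame: same sample data, but 'date' holds plain strings.
raw = pd.DataFrame(
    {
        "group1": ["a", "b", "d", "c", "a"],
        "group2": ["b", "c", "b", "a", "d"],
        "date": ["2017-01-01", "2017-01-05", "2017-01-07",
                 "2017-01-10", "2017-01-15"],
    },
    index=[1, 2, 3, 4, 5],
)

# Parse the strings into datetime values so that diff() yields Timedeltas.
raw["date"] = pd.to_datetime(raw["date"], format="%Y-%m-%d")

print(raw.dtypes)
```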

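# Stack both group columns into one long frame: one row per (date, group value),
# in row-major order, with the original column name as the index.
s = df.set_index('date').stack().rename('value').reset_index(0)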
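
To see why the reshaping helps, it may be useful to look at the intermediate frame. The sketch below rebuilds the sample data (the list-based construction is just a shorthand assumed to be equivalent to the setup above) and prints the first few stacked rows along with the per-value date differences.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "group1": ["a", "b", "d", "c", "a"],
        "group2": ["b", "c", "b", "a", "d"],
        "date": pd.to_datetime(
            ["2017-01-01", "2017-01-05", "2017-01-07",
             "2017-01-10", "2017-01-15"]
        ),
    },
    index=[1, 2, 3, 4, 5],
)

# Same reshaping as in the answer: one row per (date, column) pair.
s = df.set_index("date").stack().rename("value").reset_index(0)

# The index holds the original column name ('group1' or 'group2');
# the columns are 'date' and 'value'.
print(s.head(4))

# Within each value, diff() subtracts the date of the previous row carrying
# that value, whichever column it came from.
print(s.groupby("value")["date"].diff().head(4))
```

Note that diff() works in row order, so this relies on the rows already being sorted by date, as they are in the sample.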

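# Diff the dates within each group value, then fold the result back into two
# columns (one per original group column), aligned with df's index.
d = pd.DataFrame(s.groupby('value').date.diff().values.reshape(-1, 2),
                 columns=['g1diff', 'g2diff'],
                 index=df.index)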

df.join(d)

  group1 group2       date g1diff g2diff
1      a      b 2017-01-01    NaT    NaT
2      b      c 2017-01-05 4 days    NaT
3      d      b 2017-01-07    NaT 2 days
4      c      a 2017-01-10 5 days 9 days
5      a      d 2017-01-15 5 days 8 days
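
The question asks for plain day counts rather than Timedelta values. One possible way to get there is to apply `.dt.days` to each difference column. This is a sketch, assuming the column names grp1_diff_days and grp2_diff_days from the question are wanted and that the date column should be dropped from the final result, as in the expected table.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "group1": ["a", "b", "d", "c", "a"],
        "group2": ["b", "c", "b", "a", "d"],
        "date": pd.to_datetime(
            ["2017-01-01", "2017-01-05", "2017-01-07",
             "2017-01-10", "2017-01-15"]
        ),
    },
    index=[1, 2, 3, 4, 5],
)

# Same stack / groupby / reshape steps as in the answer.
s = df.set_index("date").stack().rename("value").reset_index(0)
d = pd.DataFrame(
    s.groupby("value")["date"].diff().to_numpy().reshape(-1, 2),
    columns=["grp1_diff_days", "grp2_diff_days"],
    index=df.index,
)

# Turn each Timedelta column into a numeric day count; NaT becomes NaN.
d = d.apply(lambda col: col.dt.days)

# Drop 'date' to mirror the layout of the expected table in the question.
result = df.drop(columns="date").join(d)
print(result)
```

Because of the missing values, the day counts come back as floats; a nullable integer dtype could be used instead if whole numbers are preferred.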