Pandas diff with a date column

Time: 2019-05-19 08:11:20

Tags: python pandas

I have a DataFrame that looks like this:

import pandas as pd

d = {'business': ['FX', 'FX', 'IR', 'IR'],
     'name': ['ed', 'ed', 'a', 'b'],
     'date': ['01/01/2018', '05/02/2018', '01/01/2018', '05/01/2018'],
     'amt': [1, 2, 3, 4]}
df = pd.DataFrame(data=d)
# Parse the day-first date strings into datetime64 values
df['date'] = pd.to_datetime(df['date'], format='%d/%m/%Y')
df

I am trying to use the diff() function to get a new column showing the difference between two dates. The final output I need is:

df['date diff']=[0,4,0,0]

Note: diff() would normally produce NaN in the rows where 0 is shown above.

1 Answer:

Answer 0: (score: 1)

I believe you need DataFrameGroupBy.diff:

df['date diff'] = df.groupby(['business','name'])['amt'].diff().fillna(0).astype(int)
print(df)
  business name       date  amt  date diff
0       FX   ed 2018-01-01    1          0
1       FX   ed 2018-02-05    5          4
2       IR    a 2018-01-01  101          0
3       IR    b 2018-01-05  105          0
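
To see why fillna(0) is needed, here is a minimal sketch of the intermediate result. It assumes the amt values shown in the printed output above (1, 5, 101, 105); the variable name raw is only for illustration. The diff restarts in every (business, name) group, so the first row of each group has nothing to subtract from and comes back as NaN.

```python
import pandas as pd

# Assumed data: amt values taken from the printed output above
df = pd.DataFrame({
    'business': ['FX', 'FX', 'IR', 'IR'],
    'name': ['ed', 'ed', 'a', 'b'],
    'amt': [1, 5, 101, 105],
})

# Difference to the previous row within each (business, name) group;
# the first row of every group is NaN, so the result is float
raw = df.groupby(['business', 'name'])['amt'].diff()

# Replace the NaN rows with 0 and cast back to integers
df['date diff'] = raw.fillna(0).astype(int)
print(df)
```

Only the second FX/ed row has a predecessor in its group, so it is the only non-zero entry.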

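Edit: to diff the dates themselves, sort within each business first, then convert the resulting timedeltas to days: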

df = df.sort_values(['business','date'])
df['date diff'] = df.groupby(['business'])['date'].diff().dt.days.fillna(0).astype(int)
print(df)
  business name       date  amt  date diff
0       FX   ed 2018-01-01    1          0
1       FX   ed 2018-02-05    5         35
2       IR    a 2018-01-01  101          0
3       IR    b 2018-01-05  105          4
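
The chain in the edit can be read one step at a time. The sketch below is only an illustration on the two FX dates from the question, with hypothetical variable names: diff() on a datetime column yields timedeltas (NaT in the first row), .dt.days turns them into day counts (NaT becomes NaN), and fillna(0).astype(int) gives plain integers.

```python
import pandas as pd

# The two FX dates from the question, parsed as day/month/year
dates = pd.Series(pd.to_datetime(['01/01/2018', '05/02/2018'],
                                 format='%d/%m/%Y'))

delta = dates.diff()                  # timedelta64 values; first row is NaT
days = delta.dt.days                  # day counts as floats; NaT becomes NaN
result = days.fillna(0).astype(int)   # integers, with 0 for the first row
print(result)
```

1 January to 5 February 2018 is 31 + 4 = 35 days, which matches the second FX row in the output above.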