Interpolating multiple columns across dates

Time: 2019-08-22 18:44:12

Tags: python pandas

I have a DataFrame that looks like this:

import pandas as pd
import numpy as np

# Two business lines; the IR rows contain a NaN followed by a one-year jump in dates.
d = {'business': ['FX', 'FX', 'FX', 'FX', 'IR', 'IR', 'IR', 'IR'],
     'A/L': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
     'date': ['01/01/2018', '02/01/2018', '03/01/2018', '04/01/2018',
              '05/01/2018', '06/01/2018', '06/01/2019', '06/01/2020'],
     'amt': [1, 2, 3, 4, 5, np.nan, 7, 8]}
df = pd.DataFrame(data=d)
df['date'] = pd.to_datetime(df['date'], format='%d/%m/%Y')
df.set_index('date', inplace=True)
# Default (linear) interpolation treats rows as equally spaced and ignores the dates.
df = df.groupby('business').apply(pd.Series.interpolate)
df

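I would like to interpolate the data above, but have the interpolation take the dates into account. Since there is a one-year "gap" between two of the rows, I would have expected the filled value to be close to 5 rather than the 6 I currently get. Do you know how to do this?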

1 Answer:

Answer 0: (score: 2)

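Once the 'date' column is set as the index, you can tell interpolate to use the index values by passing method='index'. The snippet below starts from the DataFrame with 'date' still a regular column (that is, before the set_index call in the question), for example: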

print(df.set_index('date')
        .groupby('business')
        .apply(lambda x: x.interpolate(method='index'))
        .reset_index())

        date business A/L       amt
0 2018-01-01       FX   A  1.000000
1 2018-01-02       FX   A  2.000000
2 2018-01-03       FX   A  3.000000
3 2018-01-04       FX   A  4.000000
4 2018-01-05       IR   A  5.000000
5 2018-01-06       IR   A  5.005464
6 2019-01-06       IR   A  7.000000
7 2020-01-06       IR   A  8.000000
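
To see the difference between the two methods side by side, here is a minimal self-contained sketch. It is not the answer's exact code: it swaps in groupby(...)['amt'].transform in place of apply so that only the 'amt' column is interpolated, and the names linear and by_index are illustrative only.

```python
import numpy as np
import pandas as pd

# Same data as in the question, with 'date' parsed and used as the index.
df = pd.DataFrame({
    "business": ["FX"] * 4 + ["IR"] * 4,
    "date": pd.to_datetime(
        ["01/01/2018", "02/01/2018", "03/01/2018", "04/01/2018",
         "05/01/2018", "06/01/2018", "06/01/2019", "06/01/2020"],
        format="%d/%m/%Y",
    ),
    "amt": [1, 2, 3, 4, 5, np.nan, 7, 8],
}).set_index("date")

grouped = df.groupby("business")["amt"]

# Default linear interpolation: rows are treated as equally spaced.
linear = grouped.transform(lambda s: s.interpolate())

# Index-based interpolation: the gap is weighted by the actual dates.
by_index = grouped.transform(lambda s: s.interpolate(method="index"))

gap = pd.Timestamp("2018-01-06")
print(linear.loc[gap])    # 6.0
print(by_index.loc[gap])  # approximately 5.005464
```

The index-weighted value matches a hand calculation: 2018-01-06 lies 1 day into the 366-day interval between 2018-01-05 (amt 5) and 2019-01-06 (amt 7), so the result is 5 + (7 - 5) * 1/366, which is about 5.005464.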