Repeating a pandas Series based on time

Time: 2019-08-29 09:36:08

Tags: python pandas

I have a pandas DataFrame that looks like this:

import pandas as pd
import numpy as np

d = {
    'original tenor': [10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10],
    'residual tenor': [5, 4, 3, 2, 1, 10, 9, 8, 7, 6, 5],
    'date': ['01/01/2018', '02/01/2018', '03/01/2018', '04/01/2018', '05/01/2018', '06/01/2018',
             '07/01/2018', '08/01/2018', '09/01/2018', '10/01/2018', '11/01/2018'],
}
df=pd.DataFrame(data=d)
df['date'] = pd.to_datetime(df['date'],format='%d/%m/%Y')

df

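The residual tenor decreases as the date advances. When the residual tenor reaches 1, the next residual tenor is the original tenor again. I am trying to come up with a formula that fills in the residual tenor, given the original tenor and a starting residual tenor. So, given the following DataFrame, I would expect the NaN to be replaced by 5:
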
d = {
    'original tenor': [10, 10],
    'residual tenor': [5, np.nan],
    'date': ['01/01/2018', '11/01/2018'],
}
df=pd.DataFrame(data=d)
df['date'] = pd.to_datetime(df['date'],format='%d/%m/%Y')

df
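
To make the rule concrete, here is a minimal sketch of it as a day-by-day loop. The helper names `next_residual` and `roll_forward` are made up for illustration, and the sketch assumes the residual tenor moves exactly one step per calendar day, as it does in the sample data.

```python
def next_residual(residual, original):
    """Residual tenor one day later: count down, restart at `original` after 1."""
    return original if residual == 1 else residual - 1


def roll_forward(residual, original, days):
    """Apply the one-day rule `days` times (assumes one step per calendar day)."""
    for _ in range(days):
        residual = next_residual(residual, original)
    return residual


# 01/01/2018 -> 11/01/2018 is 10 days, starting from a residual tenor of 5
print(roll_forward(5, 10, 10))  # → 5
```

A loop like this is only a reference for the expected values; a closed-form expression is what the question is after.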

1 answer:

Answer 0: (score: 1)

I had to read this a few times, but I think the following code produces the desired output:

import pandas as pd
import numpy as np

d = {
    'original tenor': [10, 10],
    'residual tenor': [5, np.nan],
    'date': ['01/01/2018', '11/01/2018'],
}
df=pd.DataFrame(data=d)
df['date'] = pd.to_datetime(df['date'],format='%d/%m/%Y')

# Assign through .loc rather than chained indexing, so the write reaches df itself
df.loc[1:, 'residual tenor'] = (df['residual tenor'][0] - (df['date'][1:] - df['date'][0]) / np.timedelta64(1, 'D')) % 10

df

Apart from `np.nan`, numpy is only needed here to convert the time difference into days.
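
As a quick check, the plain `% 10` form can be compared against the full eleven-row series from the question. The sketch below is an illustration, not part of the original answer; it uses `.dt.days` to get whole days instead of dividing by `np.timedelta64`. It suggests that `% 10` yields 0 on the day the question expects 10, while shifting by 1 before and after the modulo reproduces the series.

```python
import pandas as pd

# Daily dates 01/01/2018 .. 11/01/2018, as in the question's first DataFrame
dates = pd.Series(pd.date_range('2018-01-01', periods=11, freq='D'))
expected = [5, 4, 3, 2, 1, 10, 9, 8, 7, 6, 5]

days = (dates - dates.iloc[0]).dt.days   # whole days since the first row

plain = (5 - days) % 10                  # values fall in 0..9
shifted = (5 - days - 1) % 10 + 1        # values fall in 1..10

print(plain.tolist())    # → [5, 4, 3, 2, 1, 0, 9, 8, 7, 6, 5]
print(shifted.tolist())  # → [5, 4, 3, 2, 1, 10, 9, 8, 7, 6, 5]
```

For the two-row example the two forms agree, since both give 5 after 10 days.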


Edit, regarding the OP's comment:

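Are you familiar with modular arithmetic (`%` in Python)? It is often useful when numbers repeat in some pattern. A little thought leads to the following code for a different stop value: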

import pandas as pd
import numpy as np

d = {
    'original tenor': [10, 10, 10, 10, 10, 10],
    'residual tenor': [5, np.nan, np.nan, np.nan, np.nan, np.nan],
    'date': ['01/01/2018', '03/01/2018', '04/01/2018', '05/01/2018', '06/01/2018', '11/01/2018'],
}

df=pd.DataFrame(data=d)
df['date'] = pd.to_datetime(df['date'], format='%d/%m/%Y')

stoptenor=2
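# 11 is the original tenor (10) plus 1; assign through .loc to avoid chained indexing
df.loc[1:, 'residual tenor'] = (df['residual tenor'][0] - (df['date'][1:] - df['date'][0]) / np.timedelta64(1, 'D') - stoptenor) % (11 - stoptenor) + stoptenor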

df

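Since your pattern still repeats, but with a different "offset" (the stop value), we have to adjust the modulus accordingly. I increased the number of data points to make this clearer.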
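
The same idea can be wrapped in a small helper that reads the cycle length from the `original tenor` column instead of hard-coding 11. This is a sketch under assumptions: the function name `fill_residual_tenor` and its `stop` parameter are invented for illustration, the first row is taken to hold a valid starting residual tenor, and the tenor is assumed to move one step per calendar day.

```python
import numpy as np
import pandas as pd


def fill_residual_tenor(df, stop=1):
    """Return a copy of df with 'residual tenor' filled from the first row.

    Assumes one step per calendar day and values cycling through
    stop..original tenor (hypothetical helper, not a pandas API).
    """
    out = df.copy()
    start = out['residual tenor'].iloc[0]
    days = (out['date'] - out['date'].iloc[0]).dt.days
    period = out['original tenor'] - stop + 1   # number of distinct values per cycle
    out['residual tenor'] = (start - days - stop) % period + stop
    return out


df = pd.DataFrame({
    'original tenor': [10] * 6,
    'residual tenor': [5, np.nan, np.nan, np.nan, np.nan, np.nan],
    'date': pd.to_datetime(
        ['01/01/2018', '03/01/2018', '04/01/2018', '05/01/2018', '06/01/2018', '11/01/2018'],
        format='%d/%m/%Y'),
})

filled = fill_residual_tenor(df, stop=2)
print(filled['residual tenor'].tolist())  # → [5.0, 3.0, 2.0, 10.0, 9.0, 4.0]
```

With `stop=1` the helper reduces to the rule in the question, where the tenor restarts after reaching 1.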