I need to fill missing values in a pandas DataFrame between two matching IDs. Consider the following example:
import pandas as pd
import numpy as np

df = pd.DataFrame({'Date': pd.date_range('20130101', periods=12),
                   'ID': [np.nan, 1, np.nan, np.nan, 1, np.nan, 2, np.nan, np.nan, np.nan, 2, np.nan]})
Desired output:
         Date   ID
0  2013-01-01  NaN
1  2013-01-02  1.0
2  2013-01-03  1.0
3  2013-01-04  1.0
4  2013-01-05  1.0
5  2013-01-06  NaN
6  2013-01-07  2.0
7  2013-01-08  2.0
8  2013-01-09  2.0
9  2013-01-10  2.0
10 2013-01-11  2.0
11 2013-01-12  NaN
How can I do this?
Answer 0 (score: 3)
Compare the forward-filled and backward-filled values, and set a value only where the two agree:
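# forward fill: each row carries the last ID seen above it
s = df['ID'].ffill()
# True only where the previous ID and the next ID are the same
m = s == df['ID'].bfill()
# write the filled value into those rows only
df.loc[m, 'ID'] = s
# alternative
# df['ID'] = df['ID'].mask(m, s)
print(df)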
         Date   ID
0  2013-01-01  NaN
1  2013-01-02  1.0
2  2013-01-03  1.0
3  2013-01-04  1.0
4  2013-01-05  1.0
5  2013-01-06  NaN
6  2013-01-07  2.0
7  2013-01-08  2.0
8  2013-01-09  2.0
9  2013-01-10  2.0
10 2013-01-11  2.0
11 2013-01-12  NaN
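
For reuse, the same comparison can be wrapped in a small helper. This is only a sketch: the function name fill_between_matching_ids is made up for illustration and is not a pandas API, and it uses Series.where in place of the .loc assignment above, which should give the same result on this data. Because NaN never compares equal to anything, rows before the first ID and after the last ID stay NaN.

```python
import numpy as np
import pandas as pd


def fill_between_matching_ids(ids: pd.Series) -> pd.Series:
    """Fill NaN gaps only where the nearest ID above and below are equal.

    Hypothetical helper for illustration; not part of pandas.
    """
    forward = ids.ffill()   # last known ID looking upward
    backward = ids.bfill()  # next known ID looking downward
    # Keep the forward-filled value where both directions agree,
    # otherwise leave NaN (NaN == anything is False).
    return forward.where(forward == backward)


df = pd.DataFrame({
    'Date': pd.date_range('20130101', periods=12),
    'ID': [np.nan, 1, np.nan, np.nan, 1, np.nan,
           2, np.nan, np.nan, np.nan, 2, np.nan],
})
df['ID'] = fill_between_matching_ids(df['ID'])
print(df)
```

Note that any gap bounded by the same ID on both sides is filled, so if an ID shows up in more than two rows, every gap between its occurrences is filled as well.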