How do I extend values forward to the next non-null in pandas / numpy?

Time: 2017-12-03 20:59:01

Tags: python pandas numpy

I have a Series like this:

>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series([1,0,0,3,0,5,0,0,0], dtype=float)
>>> s[s==0] = np.nan
>>> s
0    1.0
1    NaN
2    NaN
3    3.0
4    NaN
5    5.0
6    NaN
7    NaN
8    NaN
dtype: float64

I want to "extend" each non-null value forward over the nulls that follow it, like this:

>>> t = s.shift()
>>> for _ in range(100000):
...     s[s.isnull()] = t
...     if not s.isnull().any():
...             break
...     t = t.shift()
...
>>> s
0    1.0
1    1.0
2    1.0
3    3.0
4    3.0
5    5.0
6    5.0
7    5.0
8    5.0
dtype: float64

But I'd like something more vectorized and efficient. How can I do this?

2 answers:

Answer 0 (score: 4)

You are looking for fillna with a forward fill:

>>> s.fillna(method='ffill')
0    1.0
1    1.0
2    1.0
3    3.0
4    3.0
5    5.0
6    5.0
7    5.0
8    5.0
dtype: float64
>>>
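
In newer pandas releases the method argument of fillna is deprecated in favor of the dedicated Series.ffill method. A minimal sketch of the same forward fill, assuming the Series from the question (built here as float so the zeros can be replaced by NaN directly):

```python
import numpy as np
import pandas as pd

# Rebuild the Series from the question; zeros stand in for missing values.
s = pd.Series([1, 0, 0, 3, 0, 5, 0, 0, 0], dtype=float)
s[s == 0] = np.nan

# Forward fill: each NaN takes the last non-null value before it.
filled = s.ffill()
print(filled.tolist())  # [1.0, 1.0, 1.0, 3.0, 3.0, 5.0, 5.0, 5.0, 5.0]
```

Note that ffill returns a new Series and leaves s unchanged.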

Answer 1 (score: 1)

A NumPy forward fill based on np.maximum.accumulate -

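import numpy as np
import pandas as pd

def numpy_ffill(s):
    arr = s.values
    mask = np.isnan(arr)
    # Position of each valid entry; NaN positions get 0 as a placeholder
    idx = np.where(~mask,np.arange(len(mask)),0)
    # Running maximum yields the position of the last valid entry so far
    out = arr[np.maximum.accumulate(idx)]
    # Keep the original index rather than resetting it
    return pd.Series(out, index=s.index)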

Sample run -

In [41]: s
Out[41]: 
0    1.0
1    NaN
2    NaN
3    3.0
4    NaN
5    5.0
6    NaN
7    NaN
8    NaN
dtype: float64

In [42]: numpy_ffill(s)
Out[42]: 
0    1.0
1    1.0
2    1.0
3    3.0
4    3.0
5    5.0
6    5.0
7    5.0
8    5.0
dtype: float64
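
As a sanity check, the NumPy approach can be compared against pandas' own forward fill. The sketch below restates the function from this answer (keeping the original index is an assumption added here) and uses a hypothetical test Series that starts with NaN, a case the sample run does not cover. A leading NaN has no earlier value to copy, so it should stay NaN in both results:

```python
import numpy as np
import pandas as pd

def numpy_ffill(s):
    arr = s.to_numpy()
    mask = np.isnan(arr)
    # Position of each valid entry; NaN positions get 0 as a placeholder.
    idx = np.where(~mask, np.arange(len(mask)), 0)
    # The running maximum points at the last valid position seen so far.
    return pd.Series(arr[np.maximum.accumulate(idx)], index=s.index)

# Hypothetical input with a leading NaN (not from the original question).
s = pd.Series([np.nan, 1.0, np.nan, np.nan, 3.0, np.nan])

result = numpy_ffill(s)
expected = s.ffill()

# Series.equals treats NaNs in matching positions as equal.
print(result.equals(expected))  # True
```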