Question

I have a column like this:

valueCount
0.0
nan
2.0
1.0
1.0
1.0
nan
nan
nan
4.0

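I would like to fill the NaN values based on the next available value (adding) or the previous value (subtracting), so the result should be: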

valueCount
0.0
**1.0**
2.0
1.0
1.0
1.0
**1.0**
**2.0**
**3.0**
4.0

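I know this is very conditional: if my previous value is 0, I can add +1 to the NaN rows; otherwise I should start counting from 0, 1, 2, and so on.

I can implement this algorithm with a plain Python list, but is there a simple way to do it in pandas?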
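
For reference, the rule implied by the expected output can be sketched in plain Python. This is an interpretation inferred from the example, not the asker's own code: each run of NaNs is filled by counting back from the next valid value. The function name fill_counting_back is hypothetical, and trailing NaNs that have no later valid value are assumed to stay as they are.

```python
import math


def fill_counting_back(values):
    """Fill each run of NaNs by counting back from the next valid value."""
    result = list(values)
    next_valid = None  # nearest non-NaN value to the right, if any
    steps = 0          # number of NaNs between the current row and next_valid
    # Walk right to left so the next valid value is always known.
    for i in range(len(result) - 1, -1, -1):
        v = result[i]
        if isinstance(v, float) and math.isnan(v):
            if next_valid is not None:
                steps += 1
                result[i] = next_valid - steps
        else:
            next_valid = v
            steps = 0
    return result


nan = float("nan")
values = [0.0, nan, 2.0, 1.0, 1.0, 1.0, nan, nan, nan, 4.0]
print(fill_counting_back(values))
# [0.0, 1.0, 2.0, 1.0, 1.0, 1.0, 1.0, 2.0, 3.0, 4.0]
```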

Answer 1

You can use:

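import numpy as np
import pandas as pd

df = pd.DataFrame({'valueCount': [0, np.nan, 2, 1, 1, 1, np.nan, np.nan, np.nan, 4]})

a = df['valueCount'].isnull()
b = a.cumsum()
c = df['valueCount'].bfill()
# on NaN rows: next valid value minus the distance to it
d = c + (b - b.mask(a).bfill().fillna(0).astype(int)).sub(1)
df['valueCount'] = df['valueCount'].fillna(d)
print(df)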

   valueCount
0         0.0
1         1.0
2         2.0
3         1.0
4         1.0
5         1.0
6         1.0
7         2.0
8         3.0
9         4.0

Details and explanation:

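# start from the original df, before valueCount was overwritten above
# back-fill NaNs with the next valid value
x = df['valueCount'].bfill()
# boolean mask, True where the value is NaN
a = df['valueCount'].isnull()
# cumulative sum of the mask (running count of NaNs)
b = a.cumsum()
# set the running count to NaN on rows where the mask is True
c = b.mask(a)
# back-fill those NaNs (not forward fill)
d = c.bfill()
# fill any remaining NaNs (possible only at the end) with 0 and cast to integers
e = d.fillna(0).astype(int)
# back-filled value + running count - back-filled count - 1;
# on NaN rows this is the next valid value minus the distance to it
f = x + b - e - 1
# replace NaNs in the original column with the values from f
g = df['valueCount'].fillna(f)
df = pd.concat([df['valueCount'], x, a, b, c, d, e, f, g], axis=1,
               keys=('orig', 'x', 'a', 'b', 'c', 'd', 'e', 'f', 'g'))
print(df)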
   orig    x      a  b    c    d  e    f    g
0   0.0  0.0  False  0  0.0  0.0  0 -1.0  0.0
1   NaN  2.0   True  1  NaN  1.0  1  1.0  1.0
2   2.0  2.0  False  1  1.0  1.0  1  1.0  2.0
3   1.0  1.0  False  1  1.0  1.0  1  0.0  1.0
4   1.0  1.0  False  1  1.0  1.0  1  0.0  1.0
5   1.0  1.0  False  1  1.0  1.0  1  0.0  1.0
6   NaN  4.0   True  2  NaN  4.0  4  1.0  1.0
7   NaN  4.0   True  3  NaN  4.0  4  2.0  2.0
8   NaN  4.0   True  4  NaN  4.0  4  3.0  3.0
9   4.0  4.0  False  4  4.0  4.0  4  3.0  4.0
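
The same result can also be reached by grouping each run of NaNs with the valid value that ends it. This is an alternative sketch, not part of the answer above; the names group_id, steps_to_next and filled are made up for illustration, and it assumes the rule is "next valid value minus the distance to it".

```python
import numpy as np
import pandas as pd

s = pd.Series([0, np.nan, 2, 1, 1, 1, np.nan, np.nan, np.nan, 4],
              name="valueCount")

# Counting valid values from the right gives one id per group, where a group
# is a run of NaNs together with the valid value that follows it.
group_id = s.notna()[::-1].cumsum()[::-1]

# Number the rows of each group from its end: 0 for the valid value,
# 1 for the NaN just before it, 2 for the one before that, and so on.
steps_to_next = s.groupby(group_id).cumcount(ascending=False)

# Back-fill the next valid value, then step down by the distance to it.
filled = s.bfill() - steps_to_next
print(filled.tolist())
# [0.0, 1.0, 2.0, 1.0, 1.0, 1.0, 1.0, 2.0, 3.0, 4.0]
```

Trailing NaNs have no later value to count back from, so bfill leaves them as NaN in this sketch.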

Answer 2

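In this case you can also use interpolation followed by rounding down. This does not work for every dataset, though: it only gives the desired fill when the floor of the linearly interpolated values happens to match it.

Consider: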

import pandas as pd
import numpy as np
s = pd.Series([0, np.nan, 2, 1, 1, 1, np.nan, np.nan, np.nan, 4])

Then np.floor(s.interpolate()) gives

0    0.0
1    1.0
2    2.0
3    1.0
4    1.0
5    1.0
6    1.0
7    2.0
8    3.0
9    4.0
dtype: float64
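
A small case where the two approaches disagree may help show the limitation. The input below is made up for illustration, and the comparison assumes the intended rule is to count back from the next valid value, as Answer 1 does.

```python
import numpy as np
import pandas as pd

s = pd.Series([0, np.nan, np.nan, 4])

# Linear interpolation gives 0, 1.33..., 2.67..., 4, so flooring yields 1 and 2.
floored = np.floor(s.interpolate())
print(floored.tolist())
# [0.0, 1.0, 2.0, 4.0]

# Counting back from the next valid value (4) would give 2 and 3 instead,
# i.e. [0.0, 2.0, 3.0, 4.0], so the two methods differ on this input.
```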

pandas fillna based on the previous row's value
