Question

Dummy data:

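import numpy as np
import pandas as pd
code = ['a', 'a', 'a', 'a', 'b', 'b']
serial = ['x', 'y', 'x', 'y', 'x', 'y']
result = [123, np.nan, 453, 675, 786, 332]
df = pd.DataFrame({'code': code, 'serial': serial, 'result': result})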

  code serial  result
0    a      x   123.0
1    a      y     NaN
2    a      x   453.0
3    a      y   675.0
4    b      x   786.0
5    b      y   332.0

I want to fill the NaN with 675.0: group first by code, then by serial, and fill the NaN values within each group.

Code:

df['result'] = df['result'].fillna(df.groupby('code')['result'].ffill())

How can I incorporate .groupby('serial') into the code above?

Answer 1

Use:

df['result'].fillna(df.groupby(['code', 'serial'])['result'].transform('first'))

Output:

0    123.0
1    675.0
2    453.0
3    675.0
4    786.0
5    332.0
Name: result, dtype: float64
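For anyone who wants to run this end to end, here is a minimal sketch built on the question's dummy data. The variable name group_first is only illustrative. It relies on 'first' returning the first non-null value in each (code, serial) group, and it assigns the result back, since fillna returns a new Series rather than modifying the column in place.

```python
import numpy as np
import pandas as pd

# Dummy data from the question
df = pd.DataFrame({
    "code":   ["a", "a", "a", "a", "b", "b"],
    "serial": ["x", "y", "x", "y", "x", "y"],
    "result": [123, np.nan, 453, 675, 786, 332],
})

# First non-null 'result' in each (code, serial) group,
# broadcast back to every row of that group.
group_first = df.groupby(["code", "serial"])["result"].transform("first")

# fillna returns a new Series, so assign it back to keep the change.
df["result"] = df["result"].fillna(group_first)
print(df)
```

Only rows that were NaN are touched; existing values such as 453.0 in group (a, x) are left as they are.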

Answer 2

You can group by both columns at once:

df['result'] = df.groupby(['code', 'serial'])['result'].bfill()
df

Output:

  code serial  result
0    a      x   123.0
1    a      y   675.0
2    a      x   453.0
3    a      y   675.0
4    b      x   786.0
5    b      y   332.0

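P.S. Note that you need bfill rather than ffill here, because the NaN comes before the first valid value in its group, so there is nothing earlier to fill forward from.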
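The difference between the two fill directions can be checked on the question's data with a short sketch. The names forward, backward and both are illustrative only. The last line is an optional combination, not part of the answer above, for data where a group may have gaps on either side of its valid values.

```python
import numpy as np
import pandas as pd

# Dummy data from the question
df = pd.DataFrame({
    "code":   ["a", "a", "a", "a", "b", "b"],
    "serial": ["x", "y", "x", "y", "x", "y"],
    "result": [123, np.nan, 453, 675, 786, 332],
})

grouped = df.groupby(["code", "serial"])["result"]

# Forward fill: row 1 is the first row of group (a, y), so it stays NaN.
forward = grouped.ffill()

# Backward fill: row 1 takes 675.0 from row 3, the next row of group (a, y).
backward = grouped.bfill()

# Optional: prefer the previous value in the group, fall back to the next one.
# Both fills are computed per group, so no value leaks across groups.
both = forward.fillna(backward)

print(pd.DataFrame({"ffill": forward, "bfill": backward, "both": both}))
```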

Fill NaN values after grouping twice
