Given the following DataFrame:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Site':['A','A','A','B','B','B','C','C','C'],
'Value':[np.nan,1,np.nan,np.nan,2,2,3,np.nan,3]})
df
Site Value
0 A NaN
1 A 1.0
2 A NaN
3 B NaN
4 B 2.0
5 B 2.0
6 C 3.0
7 C NaN
8 C 3.0
I want to fill the NaN values with the typical value (median or mean) of each Site. The desired result is:
Site Value
0 A 1.0
1 A 1.0
2 A 1.0
3 B 2.0
4 B 2.0
5 B 2.0
6 C 3.0
7 C 3.0
8 C 3.0
Thanks in advance!
Update: this is close, but no cigar:
df['Value']=df.groupby(['Site'])['Value'].fillna(min)
which results in:
Site Value
0 A <function amax at 0x108cf9048>
1 A 1
2 A <function amax at 0x108cf9048>
3 B <function amax at 0x108cf9048>
4 B 2
5 B 2
6 C 3
7 C <function amax at 0x108cf9048>
8 C 3
Answer 0 (score: 1)
You can use transform, as already answered here:
df['Value'] = df.groupby('Site')['Value'].transform(lambda x: x.fillna(x.mean()))
Site Value
0 A 1.0
1 A 1.0
2 A 1.0
3 B 2.0
4 B 2.0
5 B 2.0
6 C 3.0
7 C 3.0
8 C 3.0
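For completeness, here is a self-contained sketch of the same idea that can be run as-is. It rebuilds the DataFrame from the question and computes the per-Site mean (and, as an alternative, the median) with transform, then passes that aligned Series to fillna. The names group_mean, group_median, filled_mean and filled_median are just illustrative, not from the original posts. fillna expects fill values rather than an aggregation function, which is why the attempt in the question ended up with function objects in the column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Site': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
    'Value': [np.nan, 1, np.nan, np.nan, 2, 2, 3, np.nan, 3],
})

# transform returns a Series aligned with df's index, holding each
# row's group statistic (NaN values are skipped when aggregating).
group_mean = df.groupby('Site')['Value'].transform('mean')
group_median = df.groupby('Site')['Value'].transform('median')

# fillna with an aligned Series fills each NaN from the same index label.
filled_mean = df['Value'].fillna(group_mean)
filled_median = df['Value'].fillna(group_median)

print(filled_mean.tolist())
print(filled_median.tolist())
```

With this data the mean and the median give the same result, because the non-missing values within each Site are identical. On real data they can differ, so pick whichever statistic suits the column.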