Question

All values in df are one-hot encoded, i.e. 0/1.

Tried

fill_mode = lambda col: col.fillna(col.mode())
df = df.apply(fill_mode, axis=0)
df.isnull().sum()

Got

id      0
1           0
2           2
3           0

I expect every null or NaN to be filled with the column's mode.

Answer 1

col.mode() returns a Series, not a single number. So col.fillna(col.mode()) tries to align the index of col.mode() with the index of col, and most likely updates nothing. Perhaps you want:

fill_mode = lambda col: col.fillna(col.mode()[0])
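
To make the alignment behaviour concrete, here is a minimal sketch on a made-up 0/1 column (the data is an assumption for illustration, not taken from the question). It compares filling with the mode Series against filling with the scalar taken from it.

```python
import numpy as np
import pandas as pd

# Hypothetical 0/1 column with gaps at positions 1 and 3.
col = pd.Series([1.0, np.nan, 1.0, np.nan, 0.0])

# mode() returns a Series; here it holds one value (1.0) at index label 0.
mode = col.mode()

# Filling with a Series aligns on index labels. Only label 0 is available,
# and that position is not missing, so both NaNs survive.
aligned = col.fillna(mode)

# Filling with a scalar applies to every missing entry.
filled = col.fillna(mode[0])

print(aligned.isna().sum())
print(filled.tolist())
```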

Answer 2

Adjust your fill_mode function:

fill_mode = lambda col: col.fillna(col.mode().iloc[0])
df.apply(fill_mode, axis=0)

mode returns a Series, and when fillna receives a Series it matches on the index. In your case that index matching is unwanted, so we need to get rid of it.
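
One edge case may be worth guarding against: a column that is entirely NaN has an empty mode, so taking its first element raises an error. The sketch below uses a hypothetical helper named fill_with_mode (not part of pandas or of the original answers) that leaves such columns untouched.

```python
import numpy as np
import pandas as pd

def fill_with_mode(col: pd.Series) -> pd.Series:
    """Fill missing values with the column's first mode, if it has one."""
    modes = col.mode()  # NaN is ignored by default (dropna=True)
    if modes.empty:
        # The column is entirely missing, so there is nothing to fill with.
        return col
    return col.fillna(modes.iloc[0])

# Hypothetical data: column 'b' has no observed values at all.
df = pd.DataFrame({'a': [1, np.nan, 1], 'b': [np.nan, np.nan, np.nan]})
result = df.apply(fill_with_mode, axis=0)
print(result)
```

Whether an all-NaN column should stay empty or receive some default is a design choice; this sketch simply leaves it as is.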

Example

import numpy as np
import pandas as pd

df = pd.DataFrame({'1': [np.nan, 2, np.nan], '2': [1, 1, np.nan]})

fill_mode = lambda col: col.fillna(col.mode())
print(df.apply(fill_mode, axis=0))
     1    2
0  2.0  1.0 # note that only the first row is filled, because the output of mode() has the single index 0 with value 2
1  2.0  1.0
2  NaN  NaN 

df['1'].mode()
0    2.0
dtype: float64

In this case only the first value is filled, because it is the only one whose index matches.

Adding .iloc[0] pulls out the number itself, which removes the index matching from fillna:

df['1'].mode().iloc[0]
2.0
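
The same idea can also be applied to the whole frame at once, without apply. This is an alternative sketch, not something either answer proposed, and it assumes every column has at least one non-missing value. df.mode() returns a DataFrame whose first row holds one mode per column, and DataFrame.fillna accepts a Series keyed by column name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'1': [np.nan, 2, np.nan], '2': [1, 1, np.nan]})

# First row of df.mode(): one mode per column, indexed by column name.
modes = df.mode().iloc[0]

# With a Series argument, DataFrame.fillna matches the Series index against
# the column names, so each column is filled with its own mode.
filled = df.fillna(modes)
print(filled)
```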

Filling missing NaN values with the mode does not work in pandas
