I am trying to fill all the missing values (NaN) in a DataFrame.
I tried:
df_without_nan = df.fillna("D")
#then I got an error about category so I added this
df_without_nan = df.cat.add_categories("D").fillna("D")
#cat.add only works on a series....
for column in df:
    df[column] = df[column].cat.add_categories("D").fillna("D")
This is probably straightforward... help!
Answer 0 (score: 2)
You are close. Loop over only the categorical columns, selected with DataFrame.select_dtypes:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': list('abc'),
    'B': [np.nan, 'A', 'B'],
    'C': [7, 8, 9],
    'D': ['X', 'Y', np.nan],
})
df['B'] = df['B'].astype('category')
print(df)
   A    B  C    D
0  a  NaN  7    X
1  b    A  8    Y
2  c    B  9  NaN
# first add the new category to the categorical columns
for column in df.select_dtypes('category'):
    df[column] = df[column].cat.add_categories("D")
# then replace all NaNs
df = df.fillna("D")
print(df)
   A  B  C  D
0  a  D  7  X
1  b  A  8  Y
2  c  B  9  D
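
The two steps above can also be wrapped in a small helper. This is only a sketch: the name fill_missing is hypothetical, not a pandas function. It also guards against one case the loop above does not cover: cat.add_categories raises a ValueError when the fill value is already one of the column's categories, so the helper adds it only when it is missing. It works on a copy, so the original DataFrame is left unchanged.

```python
import numpy as np
import pandas as pd


def fill_missing(df, value):
    """Return a copy of df with every NaN replaced by value (hypothetical helper)."""
    out = df.copy()
    for column in out.select_dtypes("category"):
        # add_categories raises ValueError if the category already exists,
        # so only add the fill value when it is not a category yet
        if value not in out[column].cat.categories:
            out[column] = out[column].cat.add_categories(value)
    # every categorical column now accepts the fill value
    return out.fillna(value)


df = pd.DataFrame({
    "A": list("abc"),
    "B": pd.Categorical([np.nan, "A", "B"]),
    "C": [7, 8, 9],
    "D": ["X", "Y", np.nan],
})
filled = fill_missing(df, "D")
print(filled)
```

Calling fill_missing(filled, "D") a second time is harmless, because "D" is already a category of column B and the add_categories call is skipped.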