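I have a column containing NaN values that I want to fill with values from another DataFrame, matched on a key. Is there a simple way to do this?
Example: I have a DataFrame of objects and their colors, like this: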
object color
0 chair black
1 ball yellow
2 door brown
3 ball **NaN**
4 chair white
5 chair **NaN**
6 ball grey
I want to fill the NaN values in the color column with the default colors from the following DataFrame:
object default_color
0 chair brown
1 ball blue
2 door grey
so that the result looks like this:
object color
0 chair black
1 ball yellow
2 door brown
3 ball **blue**
4 chair white
5 chair **brown**
6 ball grey
Is there an "easy" way to do this?
Thanks :)
Answer 0 (score: 7)
Use np.where together with map, after setting the key column of df2 as its index, i.e.
df['color']= np.where(df['color'].isnull(),df['object'].map(df2.set_index('object')['default_color']),df['color'])
Or Series.where (called here on df['color']):
df['color'] = df['color'].where(df['color'].notnull(), df['object'].map(df2.set_index('object')['default_color']))
object color
0 chair black
1 ball yellow
2 door brown
3 ball blue
4 chair white
5 chair brown
6 ball grey
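For anyone who wants to run the two one-liners above, here is a minimal self-contained sketch. The sample frames are rebuilt from the question; the names `defaults`, `filled_np` and `filled_where` are only illustrative and do not appear in the original answer.

```python
import numpy as np
import pandas as pd

# Sample frames rebuilt from the question.
df = pd.DataFrame({
    "object": ["chair", "ball", "door", "ball", "chair", "chair", "ball"],
    "color": ["black", "yellow", "brown", np.nan, "white", np.nan, "grey"],
})
df2 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "default_color": ["brown", "blue", "grey"],
})

# Look up the default color of every row by its object.
defaults = df["object"].map(df2.set_index("object")["default_color"])

# np.where returns a NumPy array: the default where color is missing, else color.
filled_np = np.where(df["color"].isnull(), defaults, df["color"])

# Series.where keeps values where the condition holds and takes `defaults` elsewhere.
filled_where = df["color"].where(df["color"].notnull(), defaults)

print(filled_where.tolist())
```

Note that np.where hands back a plain array rather than a Series, which is fine when the result is assigned straight back to df['color'] as in the answer.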
Answer 1 (score: 4)
First create the Series s, then use it to replace the NaN values:
s = df1['object'].map(df2.set_index('object')['default_color'])
print (s)
0 brown
1 blue
2 grey
3 blue
4 brown
5 brown
6 blue
Name: object, dtype: object
df1['color']= df1['color'].mask(df1['color'].isnull(), s)
Or:
df1.loc[df1['color'].isnull(), 'color'] = s
Or:
df1['color'] = df1['color'].combine_first(s)
Or:
df1['color'] = df1['color'].fillna(s)
print (df1)
object color
0 chair black
1 ball yellow
2 door brown
3 ball blue
4 chair white
5 chair brown
6 ball grey
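As a quick sanity check, the four variants above can be run side by side on the sample data. This is only a sketch: the frames are rebuilt from the question, and `via_mask`, `via_combine`, `via_fillna`, `via_loc` and `results` are illustrative names.

```python
import numpy as np
import pandas as pd

# Sample frames rebuilt from the question.
df1 = pd.DataFrame({
    "object": ["chair", "ball", "door", "ball", "chair", "chair", "ball"],
    "color": ["black", "yellow", "brown", np.nan, "white", np.nan, "grey"],
})
df2 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "default_color": ["brown", "blue", "grey"],
})

# Default color for every row, aligned with df1's index.
s = df1["object"].map(df2.set_index("object")["default_color"])

via_mask = df1["color"].mask(df1["color"].isnull(), s)
via_combine = df1["color"].combine_first(s)
via_fillna = df1["color"].fillna(s)

# The loc variant assigns in place, so work on a copy here.
via_loc = df1.copy()
via_loc.loc[via_loc["color"].isnull(), "color"] = s

# Put the four results next to each other for comparison.
results = pd.DataFrame({
    "mask": via_mask,
    "combine_first": via_combine,
    "fillna": via_fillna,
    "loc": via_loc["color"],
})
print(results)
```

All four rely on s sharing df1's index, so only the rows where color is missing pick up a value from s.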
If the values in the object column are unique:
df = (df1.set_index('object')['color']
         .combine_first(df2.set_index('object')['default_color'])
         .reset_index())
Or:
df = (df1.set_index('object')['color']
         .fillna(df2.set_index('object')['default_color'])
         .reset_index())
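A small sketch of the unique-key case follows. The three-row df1 here is hypothetical (the question's own df1 has repeated objects, so it does not qualify), and `colors`, `defaults`, `via_fillna` and `via_combine` are illustrative names.

```python
import numpy as np
import pandas as pd

# Hypothetical df1 in which every object appears exactly once.
df1 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "color": ["black", np.nan, "brown"],
})
df2 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "default_color": ["brown", "blue", "grey"],
})

# With unique keys, both Series can be indexed by object and aligned directly.
colors = df1.set_index("object")["color"]
defaults = df2.set_index("object")["default_color"]

via_fillna = colors.fillna(defaults).reset_index()
via_combine = colors.combine_first(defaults).reset_index()

print(via_fillna)
print(via_combine)
```

One difference worth keeping in mind: combine_first returns the union of both indexes, so an object that exists only in df2 would also appear in its result, whereas fillna keeps just the rows of df1.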
Answer 2 (score: 4)
Use loc + map:
m = df.color.isnull()
df.loc[m, 'color'] = df.loc[m, 'object'].map(df2.set_index('object').default_color)
df
object color
0 chair black
1 ball yellow
2 door brown
3 ball blue
4 chair white
5 chair brown
6 ball grey
If you are going to do a lot of these replacements, call set_index on df2 once and save the result.
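One way to follow that advice is sketched below, under the assumption that several frames share the same object/color layout. `default_by_object` and `fill_default_colors` are hypothetical names, not part of pandas or of the original answer.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "default_color": ["brown", "blue", "grey"],
})

# Build the lookup Series once and reuse it for every frame.
default_by_object = df2.set_index("object")["default_color"]


def fill_default_colors(frame, lookup=default_by_object):
    """Return a copy of `frame` with missing colors taken from `lookup`."""
    out = frame.copy()
    missing = out["color"].isnull()
    # Only the rows with a missing color are mapped and overwritten.
    out.loc[missing, "color"] = out.loc[missing, "object"].map(lookup)
    return out


first = pd.DataFrame({"object": ["ball", "chair"], "color": [np.nan, "white"]})
second = pd.DataFrame({"object": ["door", "ball"], "color": [np.nan, np.nan]})

print(fill_default_colors(first))
print(fill_default_colors(second))
```

The helper works on a copy so the input frames stay untouched; drop the copy if in-place modification is what you want.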