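In a pandas DataFrame, I am trying to relabel two levels of a variable with a single name while leaving the NaN values in that variable unchanged.
Here is a reproducible example using a modified version of the "mtcars" dataset. In it, I want to relabel the 'yes' and 'no' levels of the "am" variable as 'new', for example.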
                    mpg  cyl   disp   hp  drat     wt   qsec  vs   am
Mazda RX4          21.0  six  160.0  110  3.90  2.620  16.46   0  yes
Mazda RX4 Wag      21.0  two  160.0  110  3.90  2.875  17.02   0  NaN
Datsun 710         22.8  six  108.0   93  3.85  2.320  18.61   1   no
Hornet 4 Drive     21.4  two  258.0  110  3.08  3.215  19.44   1  NaN
Hornet Sportabout  18.7  six  360.0  175  3.15  3.440  17.02   0  yes
Valiant            18.1  two  225.0  105  2.76  3.460  20.22   1  NaN
Duster 360         14.3  two  360.0  245  3.21  3.570  15.84   0   no
The result should look like this:
                    mpg  cyl   disp   hp  drat     wt   qsec  vs   am
Mazda RX4          21.0  six  160.0  110  3.90  2.620  16.46   0  new
Mazda RX4 Wag      21.0  two  160.0  110  3.90  2.875  17.02   0  NaN
Datsun 710         22.8  six  108.0   93  3.85  2.320  18.61   1  new
Hornet 4 Drive     21.4  two  258.0  110  3.08  3.215  19.44   1  NaN
Hornet Sportabout  18.7  six  360.0  175  3.15  3.440  17.02   0  new
Valiant            18.1  two  225.0  105  2.76  3.460  20.22   1  NaN
Duster 360         14.3  two  360.0  245  3.21  3.570  15.84   0  new
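For anyone who wants to run this, the input table (the first one above) could be rebuilt roughly as follows. This is only a sketch: the values are copied from that table, while the variable name `df` and the choice to use the car names as the index are assumptions.

```python
import numpy as np
import pandas as pd

# Values copied from the input table; `df` and the car-name index are assumptions.
df = pd.DataFrame(
    {
        "mpg": [21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3],
        "cyl": ["six", "two", "six", "two", "six", "two", "two"],
        "disp": [160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0],
        "hp": [110, 110, 93, 110, 175, 105, 245],
        "drat": [3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21],
        "wt": [2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570],
        "qsec": [16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84],
        "vs": [0, 0, 1, 1, 0, 1, 0],
        # np.nan marks the missing "am" entries that must stay untouched.
        "am": ["yes", np.nan, "no", np.nan, "yes", np.nan, "no"],
    },
    index=[
        "Mazda RX4",
        "Mazda RX4 Wag",
        "Datsun 710",
        "Hornet 4 Drive",
        "Hornet Sportabout",
        "Valiant",
        "Duster 360",
    ],
)

print(df)
```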
Answer 0 (score: 2)
There are two approaches here. The first uses notnull to set every non-NaN value to 'new':
In [21]:
df.loc[df['am'].notnull(),'am'] = 'new'
df
Out[21]:
                    mpg  cyl  disp   hp  drat     wt   qsec  vs   am
Mazda RX4          21.0  six   160  110  3.90  2.620  16.46   0  new
Mazda RX4 Wag      21.0  two   160  110  3.90  2.875  17.02   0  NaN
Datsun 710         22.8  six   108   93  3.85  2.320  18.61   1  new
Hornet 4 Drive     21.4  two   258  110  3.08  3.215  19.44   1  NaN
Hornet Sportabout  18.7  six   360  175  3.15  3.440  17.02   0  new
Valiant            18.1  two   225  105  2.76  3.460  20.22   1  NaN
Duster 360         14.3  two   360  245  3.21  3.570  15.84   0  new
The other approach uses isin to select the rows where "am" is 'yes' or 'no' and sets those to 'new':
In [23]:
df.loc[df['am'].isin(['yes','no']),'am'] = 'new'
df
Out[23]:
                    mpg  cyl  disp   hp  drat     wt   qsec  vs   am
Mazda RX4          21.0  six   160  110  3.90  2.620  16.46   0  new
Mazda RX4 Wag      21.0  two   160  110  3.90  2.875  17.02   0  NaN
Datsun 710         22.8  six   108   93  3.85  2.320  18.61   1  new
Hornet 4 Drive     21.4  two   258  110  3.08  3.215  19.44   1  NaN
Hornet Sportabout  18.7  six   360  175  3.15  3.440  17.02   0  new
Valiant            18.1  two   225  105  2.76  3.460  20.22   1  NaN
Duster 360         14.3  two   360  245  3.21  3.570  15.84   0  new
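The two approaches give the same result on this data, but they are not equivalent in general: notnull overwrites every non-missing value, while isin only overwrites the values that are listed. A small sketch of the difference, using a hypothetical extra level 'maybe' that does not appear in the question's data:

```python
import numpy as np
import pandas as pd

# Hypothetical column with an extra level, "maybe", not present in the question.
s = pd.Series(["yes", np.nan, "no", "maybe"], name="am")

# notnull: every non-missing value becomes "new", including "maybe".
by_notnull = s.copy()
by_notnull.loc[by_notnull.notnull()] = "new"

# isin: only the listed values become "new"; "maybe" is left as it was.
by_isin = s.copy()
by_isin.loc[by_isin.isin(["yes", "no"])] = "new"

print(by_notnull)
print(by_isin)
```

If the column could ever hold levels other than 'yes' and 'no', isin is probably the safer choice for merging just those two.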
Answer 1 (score: 1)
Try:
mt['am'] = mt.am.map(lambda x: x if pd.isnull(x) else 'new')
Output:
In [21]: df = pd.DataFrame(['yes',np.nan,'no',np.nan], columns=['am'])
In [22]: df
Out[22]:
am
0 yes
1 NaN
2 no
3 NaN
In [23]: df['am'] = df.am.map(lambda x: x if pd.isnull(x) else 'new')
In [24]: df
Out[24]:
am
0 new
1 NaN
2 new
3 NaN
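As a possible alternative to the lambda (not part of the answer above), the same result can be sketched with Series.where, which keeps the entries where a condition holds and substitutes a value elsewhere. The frame below mirrors the small example; treat it as an illustration rather than the answerer's method.

```python
import numpy as np
import pandas as pd

# Same toy column as in the example above.
df = pd.DataFrame({"am": ["yes", np.nan, "no", np.nan]})

# where() keeps entries for which the condition is True (the missing ones)
# and replaces all other entries with "new".
df["am"] = df["am"].where(df["am"].isna(), "new")

print(df)
```

This avoids calling a Python function per element, which may matter on large columns.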