Relabel levels in pandas

Time: 2015-06-29 17:22:35

Tags: python pandas

In a pandas DataFrame, I am trying to relabel two levels of a variable with a single name, while leaving the NaN values in that variable unchanged.

Below is a reproducible example using a modified version of the "mtcars" dataset. Here, I want to relabel the 'yes' and 'no' levels of the 'am' variable as 'new'.

                    mpg   cyl  disp  hp drat    wt  qsec vs  am  
Mazda RX4           21.0  six 160.0 110 3.90 2.620 16.46  0  yes     
Mazda RX4 Wag       21.0  two 160.0 110 3.90 2.875 17.02  0  NaN    
Datsun 710          22.8  six 108.0  93 3.85 2.320 18.61  1  no    
Hornet 4 Drive      21.4  two 258.0 110 3.08 3.215 19.44  1  NaN   
Hornet Sportabout   18.7  six 360.0 175 3.15 3.440 17.02  0  yes  
Valiant             18.1  two 225.0 105 2.76 3.460 20.22  1  NaN   
Duster 360          14.3  two 360.0 245 3.21 3.570 15.84  0  no   

The result should look like this:

                    mpg   cyl  disp  hp drat    wt  qsec vs  am  
Mazda RX4           21.0  six 160.0 110 3.90 2.620 16.46  0  new     
Mazda RX4 Wag       21.0  two 160.0 110 3.90 2.875 17.02  0  NaN    
Datsun 710          22.8  six 108.0  93 3.85 2.320 18.61  1  new    
Hornet 4 Drive      21.4  two 258.0 110 3.08 3.215 19.44  1  NaN   
Hornet Sportabout   18.7  six 360.0 175 3.15 3.440 17.02  0  new  
Valiant             18.1  two 225.0 105 2.76 3.460 20.22  1  NaN   
Duster 360          14.3  two 360.0 245 3.21 3.570 15.84  0  new
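
For anyone who wants to run the answers locally, the input table can be rebuilt roughly as follows. This is only a sketch: it keeps a subset of the columns shown above (mpg, cyl, vs, am), and the variable name df is an assumption.

```python
import numpy as np
import pandas as pd

# Rebuild the example table; only a few of the mtcars columns are included.
df = pd.DataFrame(
    {
        "mpg": [21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3],
        "cyl": ["six", "two", "six", "two", "six", "two", "two"],
        "vs": [0, 0, 1, 1, 0, 1, 0],
        # Missing entries are real NaN values, not the string "NaN".
        "am": ["yes", np.nan, "no", np.nan, "yes", np.nan, "no"],
    },
    index=[
        "Mazda RX4", "Mazda RX4 Wag", "Datsun 710", "Hornet 4 Drive",
        "Hornet Sportabout", "Valiant", "Duster 360",
    ],
)
print(df)
```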

2 answers:

Answer 0 (score: 2)

There are two approaches here. The first uses notnull to set every non-NaN value to 'new':

In [21]:
df.loc[df['am'].notnull(),'am'] = 'new'
df

Out[21]:
                    mpg  cyl  disp   hp  drat     wt   qsec  vs   am
Mazda RX4          21.0  six   160  110  3.90  2.620  16.46   0  new
Mazda RX4 Wag      21.0  two   160  110  3.90  2.875  17.02   0  NaN
Datsun 710         22.8  six   108   93  3.85  2.320  18.61   1  new
Hornet 4 Drive     21.4  two   258  110  3.08  3.215  19.44   1  NaN
Hornet Sportabout  18.7  six   360  175  3.15  3.440  17.02   0  new
Valiant            18.1  two   225  105  2.76  3.460  20.22   1  NaN
Duster 360         14.3  two   360  245  3.21  3.570  15.84   0  new

The other approach uses isin to select only the rows whose value is 'yes' or 'no' and sets those to 'new':

In [23]:
df.loc[df['am'].isin(['yes','no']),'am'] = 'new'
df

Out[23]:
                    mpg  cyl  disp   hp  drat     wt   qsec  vs   am
Mazda RX4          21.0  six   160  110  3.90  2.620  16.46   0  new
Mazda RX4 Wag      21.0  two   160  110  3.90  2.875  17.02   0  NaN
Datsun 710         22.8  six   108   93  3.85  2.320  18.61   1  new
Hornet 4 Drive     21.4  two   258  110  3.08  3.215  19.44   1  NaN
Hornet Sportabout  18.7  six   360  175  3.15  3.440  17.02   0  new
Valiant            18.1  two   225  105  2.76  3.460  20.22   1  NaN
Duster 360         14.3  two   360  245  3.21  3.570  15.84   0  new
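
The two approaches give the same result on this data, but they differ once the column holds any other label. A small sketch of the difference, where the extra value "maybe" is hypothetical and not part of the question's data:

```python
import numpy as np
import pandas as pd

# "maybe" is a made-up third label, added only to show the difference.
s = pd.Series(["yes", np.nan, "no", "maybe"])

# notnull: every non-missing value is overwritten, including "maybe".
by_notnull = s.copy()
by_notnull.loc[by_notnull.notnull()] = "new"

# isin: only the listed labels are overwritten; "maybe" is left alone.
by_isin = s.copy()
by_isin.loc[by_isin.isin(["yes", "no"])] = "new"

print(by_notnull)
print(by_isin)
```

So isin is the safer choice if only specific levels should be merged, while notnull is enough when everything except NaN should become 'new'.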

Answer 1 (score: 1)

Try:

  mt['am'] = mt.am.map(lambda x: x if pd.isnull(x) else 'new')

Output:

In [21]: df = pd.DataFrame(['yes',np.nan,'no',np.nan], columns=['am'])

In [22]: df
Out[22]: 
    am
0  yes
1  NaN
2   no
3  NaN

In [23]: df['am'] = df.am.map(lambda x: x if pd.isnull(x) else 'new')

In [24]: df
Out[24]: 
    am
0  new
1  NaN
2  new
3  NaN
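
The map call above runs a Python lambda on each element. As a possible alternative, the same relabeling can be sketched with Series.where, which keeps values where the condition is True and substitutes the second argument elsewhere. This is a variant, not part of the original answer:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(["yes", np.nan, "no", np.nan], columns=["am"])

# Keep the value where it is missing; replace everything else with "new".
df["am"] = df["am"].where(df["am"].isnull(), "new")

print(df)
```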