Pandas is not replacing strings in a DataFrame

Time: 2018-02-21 13:26:30

Tags: python pandas dataframe

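I have seen the question linked below, but its answer did not work for me. I am sure I am making a big mistake somewhere, so please tell me what I am doing wrong. I want the values in columns such as "Street", "LandContour" and so on to be replaced with numbers, for example "Pave" with 1.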

python pandas replacing strings in dataframe with numbers

This is my code so far:

import numpy as np
import pandas as pd

df=pd.read_csv('train.csv')       # getting file

df.fillna(-99999, inplace=True)

#df.replace("Street", 0, True)    didn't work

# mapping={'Street':1,'LotShape':2,'LandContour':3,'Utilities':4,'SaleCondition':5}

# df.replace('Street', 0)  # didn't work

# df.replace({'Street': mapping, 'LotShape': mapping, 
#            'LandContour': mapping, 'Utilities': mapping,
#            'SaleCondition': mapping})
# didn't work ^
df.head()

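I tried df['Street'].replace("pave",0,inplace=True) and many other things, but none of them had any effect. Not even a single value passed to df.replace gets replaced. My df itself works fine: it prints the head and specific columns, and df.fillna works as well. Any help would be great.

Edit: all the uncommented lines work; I want the commented-out lines to work as well.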

Sample output:

Id  MSSubClass MSZoning  LotFrontage    LotArea     Street   Alley LotShape  \
0   1          60       RL         65.0     8450   Pave  -99999      Reg   
1   2          20       RL         80.0     9600   Pave  -99999      Reg   
2   3          60       RL         68.0    11250   Pave  -99999      IR1   
3   4          70       RL         60.0     9550   Pave  -99999      IR1   
4   5          60       RL         84.0    14260   Pave  -99999      IR1   

  LandContour Utilities    ...     PoolArea  PoolQC   Fence MiscFeature  \
0         Lvl    AllPub    ...            0  -99999  -99999      -99999   
1         Lvl    AllPub    ...            0  -99999  -99999      -99999   
2         Lvl    AllPub    ...            0  -99999  -99999      -99999   
3         Lvl    AllPub    ...            0  -99999  -99999      -99999   
4         Lvl    AllPub    ...            0  -99999  -99999      -99999   

  MiscVal MoSold YrSold  SaleType  SaleCondition  SalePrice  
0       0      2   2008        WD         Normal     208500  
1       0      5   2007        WD         Normal     181500  
2       0      9   2008        WD         Normal     223500  
3       0      2   2006        WD        Abnorml     140000  
4       0     12   2008        WD         Normal     250000  

I also tried:

mapping={'Pave':1,'Lvl':2,'AllPub':3,'Reg':4,'Normal':5,'Abnormal':0,'IR1':6}

#df.replace('Street',0)

df.replace({'Street': mapping, 'LotShape': mapping, 
'LandContour': mapping, 'Utilities': mapping, 'SaleCondition': mapping})

But that did not work either ^

1 answer:

Answer 0: (score: 3)

Try:

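df = pd.read_csv('train.csv')                    # reset
df.fillna(-99999, inplace=True)                  # refill
# Assign the result back to the column; calling replace(..., inplace=True) on
# df['Street'] may have no effect on df in recent pandas versions.
df['Street'] = df['Street'].replace('Pave', 0)   # replace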

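The problem with your previous attempts is that they did not apply the replacement to the right column with the right search value. df.replace("Street", 0) searches the cell values for the string "Street", which is a column name and never appears as a value, and "pave" does not match the actual value "Pave". Watch the capitalization and spelling as well: the data contains "Abnorml", not "Abnormal". Also note that df.replace(...) without inplace=True returns a new DataFrame, so the result has to be assigned back.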
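To replace values in several columns at once, a nested dictionary can be passed to df.replace. The following is only a minimal sketch: the small DataFrame is a hypothetical stand-in for a few columns of train.csv, built from the sample output in the question, and the numeric codes are taken from the asker's own mapping.

```python
import pandas as pd

# Hypothetical stand-in for a few columns of train.csv (values copied from
# the sample output in the question); the real file is not used here.
df = pd.DataFrame({
    "Street": ["Pave", "Pave", "Pave"],
    "LotShape": ["Reg", "Reg", "IR1"],
    "SaleCondition": ["Normal", "Normal", "Abnorml"],
})

# Outer keys are column names; each inner dict maps old value -> new value.
# Keys must match the data exactly: "Pave" (not "pave"), "Abnorml" (not "Abnormal").
mapping = {
    "Street": {"Pave": 1},
    "LotShape": {"Reg": 4, "IR1": 6},
    "SaleCondition": {"Normal": 5, "Abnorml": 0},
}

# replace() returns a new DataFrame, so the result is assigned back to df.
df = df.replace(mapping)

print(df)
```

Giving each column its own inner dictionary keeps a code from one column from being applied to another by accident. Depending on the pandas version, the replaced columns may end up as an integer or an object dtype, and some versions emit a warning about downcasting.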