Python: replace strings in a specific DataFrame column

Time: 2017-03-06 10:23:11

Tags: python pandas dataframe

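In one column of a DataFrame, I want to replace any value whose first word starts with the string "chaud" with the string 'Chaudière'. The first and last name that follow each "Chaudiere" should be removed, so that NameDevice is anonymized.

My DataFrame is named df1 and the column is named NameDevice.

I tried this: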

   df1.loc[df1['NameDevice'].str.startswith('chaud'), 'NameDevice'] = df1['NameDevice'].str.replace("chaud","Chaudière")

I checked with df1.head(); it returns:

   IdDevice  IdDeviceType  SerialDevice  NameDevice                       IdLocation  UuidAttributeDevice  IdBox  IsUpdateDevice
0       119            48         00001  Chaudière Maud Ferrand                    4                  NaN      4               0
1       120            48         00002  Chaudière Yvan Martinod                   6                  NaN      6               0
2       121            48         00006  Chaudière Anne-Sophie Premereur           7                  NaN      7               0
3       122            48         00005  Chaudière Denis Fauser                    8                  NaN      8               0
4       123            48         00004  Chaudière Elariak Djilali                 3                  NaN      3               0

1 Answer:

Answer 0 (score: 0)

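You can call str.lower first so that the match is case-insensitive, test the prefix with str.startswith, then split on whitespace and keep only the first token to anonymize the data: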

In [14]:
df.loc[df['NameDevice'].str.lower().str.startswith('chaud'), 'NameDevice'] = df['NameDevice'].str.split().str[0]
df

Out[14]:
   IdDevice  IdDeviceType  SerialDevice NameDevice  IdLocation  \
0       119            48             1  Chaudière           4   
1       120            48             2  Chaudière           6   
2       121            48             6  Chaudière           7   
3       122            48             5  Chaudière           8   
4       123            48             4  Chaudière           3   

   UuidAttributeDevice  IdBox  IsUpdateDevice  
0                  NaN      4               0  
1                  NaN      6               0  
2                  NaN      7               0  
3                  NaN      8               0  
4                  NaN      3               0  
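
For readers who want to run this without the original data, here is a minimal self-contained sketch of the same idea. The sample rows are hypothetical (a lowercase entry and a non-matching device are included on purpose), and the final assignment of a fixed label is an alternative not shown in the answer, added because the first token keeps whatever casing the row already had.

```python
import pandas as pd

# Hypothetical sample data; only the NameDevice column matters here.
df = pd.DataFrame({
    "IdDevice": [119, 120, 121],
    "NameDevice": [
        "Chaudière Maud Ferrand",
        "chaudière Yvan Martinod",
        "Thermostat Salon",
    ],
})

# Case-insensitive test for names beginning with "chaud".
mask = df["NameDevice"].str.lower().str.startswith("chaud")

# Keep only the first whitespace-separated token on the matching rows.
# The right-hand Series is aligned on the index, so other rows are untouched.
first_token = df["NameDevice"].str.split().str[0]
df.loc[mask, "NameDevice"] = first_token
names_after_split = df["NameDevice"].tolist()

# Alternative (an assumption, not part of the answer): assign the fixed
# label directly, which also normalizes the casing.
df.loc[mask, "NameDevice"] = "Chaudière"
names_after_label = df["NameDevice"].tolist()
```

With the split approach the second row stays as "chaudière" because its first token was lowercase; the fixed label makes every matching row identical, which is closer to what the question asks for.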

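Another option is str.extract, which keeps only the leading Chaud... word. The pattern below also captures the trailing space and, unlike the mask, is case-sensitive: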

In [27]:
df.loc[df['NameDevice'].str.lower().str.startswith('chaud'), 'NameDevice'] = df['NameDevice'].str.extract(r'(Chaud\w+ )', expand=False)
df

Out[27]:
   IdDevice  IdDeviceType  SerialDevice  NameDevice  IdLocation  \
0       119            48             1  Chaudière            4   
1       120            48             2  Chaudière            6   
2       121            48             6  Chaudière            7   
3       122            48             5  Chaudière            8   
4       123            48             4  Chaudière            3   

   UuidAttributeDevice  IdBox  IsUpdateDevice  
0                  NaN      4               0  
1                  NaN      6               0  
2                  NaN      7               0  
3                  NaN      8               0  
4                  NaN      3               0
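
The regex approach can be tightened a little. The sketch below is one possible variant, not the answer's exact code: it assumes hypothetical sample rows, drops the trailing space from the capture group, and passes re.IGNORECASE so the pattern agrees with the case-insensitive mask. The last statement swaps in a different technique, a single anchored str.replace, which writes the fixed label without needing a mask at all.

```python
import re

import pandas as pd

# Hypothetical sample data mirroring the NameDevice column.
df = pd.DataFrame({
    "NameDevice": [
        "Chaudière Maud Ferrand",
        "chaudiere Yvan Martinod",
        "Thermostat Salon",
    ],
})

mask = df["NameDevice"].str.lower().str.startswith("chaud")

# Capture the leading "chaud..." word without the trailing space.
# Rows that do not match yield NaN in first_word.
first_word = df["NameDevice"].str.extract(
    r"^(chaud\w*)", flags=re.IGNORECASE, expand=False
)

# Keep the original value where the mask is False, the captured word otherwise.
extracted = df["NameDevice"].where(~mask, first_word)

# Different technique: replace the whole value in one step when it starts
# with "chaud" in any casing; non-matching rows are left as they are.
replaced = df["NameDevice"].str.replace(
    r"^chaud.*", "Chaudière", regex=True, flags=re.IGNORECASE
)
```

The extract variant preserves each row's own spelling of the first word, while the replace variant normalizes every match to 'Chaudière'.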