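I want to replace any string in a DataFrame column with the string 'Chaudière' whenever it starts with "chaud". In other words, I want to drop the first and last names that follow each "Chaudière" in order to anonymize NameDevice.
My DataFrame is named df1 and the column is named NameDevice.
I tried this: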
df1.loc[df1['NameDevice'].str.startswith('chaud'), 'NameDevice'] = df1['NameDevice'].str.replace("chaud", "Chaudière")
When I check with df1.head(), it returns:
IdDevice IdDeviceType SerialDevice NameDevice IdLocation UuidAttributeDevice IdBox IsUpdateDevice
0 119 48 00001 Chaudière Maud Ferrand 4 NaN 4 0
1 120 48 00002 Chaudière Yvan Martinod 6 NaN 6 0
2 121 48 00006 Chaudière Anne-Sophie Premereur 7 NaN 7 0
3 122 48 00005 Chaudière Denis Fauser 8 NaN 8 0
4 123 48 00004 Chaudière Elariak Djilali 3 NaN 3 0
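Answer 0 (score: 0)
You can call str.lower first so that the match is case-insensitive, then use str.startswith to select the rows, and then simply split on whitespace with str.split and keep the first entry to anonymize the data: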
In [14]:
df.loc[df['NameDevice'].str.lower().str.startswith('chaud'), 'NameDevice'] = df['NameDevice'].str.split().str[0]
df
Out[14]:
IdDevice IdDeviceType SerialDevice NameDevice IdLocation \
0 119 48 1 Chaudière 4
1 120 48 2 Chaudière 6
2 121 48 6 Chaudière 7
3 122 48 5 Chaudière 8
4 123 48 4 Chaudière 3
UuidAttributeDevice IdBox IsUpdateDevice
0 NaN 4 0
1 NaN 6 0
2 NaN 7 0
3 NaN 8 0
4 NaN 3 0
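To try this outside an interactive session, here is a minimal self-contained sketch of the same split-based approach. The first two names come from the question; the lowercase row and the "Thermostat Salon" row are hypothetical additions, included only to show what happens to a lowercase match and to a non-matching device.

```python
import pandas as pd

# Rows 0-1 use names from the question. Rows 2-3 are hypothetical:
# a lowercase variant and a device that should not be touched.
df = pd.DataFrame({
    "IdDevice": [119, 120, 121, 122],
    "NameDevice": [
        "Chaudière Maud Ferrand",
        "Chaudière Yvan Martinod",
        "chaudière Denis Fauser",
        "Thermostat Salon",
    ],
})

# Case-insensitive test: lowercase a temporary copy, then check the prefix.
mask = df["NameDevice"].str.lower().str.startswith("chaud")

# Keep only the first whitespace-separated word on the matching rows.
# The right-hand side is aligned on the index, so other rows are unchanged.
df.loc[mask, "NameDevice"] = df["NameDevice"].str.split().str[0]

print(df)
```

Note that this keeps whatever the first word was, so the hypothetical lowercase row ends up as "chaudière" rather than "Chaudière".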
Another way is to use str.extract so that only the Chaud... word is kept:
In [27]:
df.loc[df['NameDevice'].str.lower().str.startswith('chaud'), 'NameDevice'] = df['NameDevice'].str.extract(r'(Chaud\w+)', expand=False)
df
Out[27]:
IdDevice IdDeviceType SerialDevice NameDevice IdLocation \
0 119 48 1 Chaudière 4
1 120 48 2 Chaudière 6
2 121 48 6 Chaudière 7
3 122 48 5 Chaudière 8
4 123 48 4 Chaudière 3
UuidAttributeDevice IdBox IsUpdateDevice
0 NaN 4 0
1 NaN 6 0
2 NaN 7 0
3 NaN 8 0
4 NaN 3 0
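If the goal is to end up with exactly 'Chaudière' on every matching row, regardless of how the original value was spelled or capitalized, a constant can be assigned instead of a piece of the original string. The following is only a sketch: the helper name anonymize_boilers, its parameters, and the sample rows other than the first name are assumptions made for illustration, not part of the question or the answer above.

```python
import pandas as pd


def anonymize_boilers(frame, column="NameDevice", label="Chaudière"):
    """Return a copy where values starting with 'chaud' (any case) become `label`.

    Hypothetical helper for illustration; the names are not from the question.
    """
    out = frame.copy()
    # na=False treats missing names as non-matching instead of propagating NaN
    # into the boolean mask.
    mask = out[column].str.lower().str.startswith("chaud", na=False)
    out.loc[mask, column] = label
    return out


# Sample data: the first name is from the question, the rest are made up.
df1 = pd.DataFrame({
    "NameDevice": [
        "Chaudière Maud Ferrand",
        "chaudiere Yvan Martinod",
        "Thermostat Salon",
        None,
    ],
})

result = anonymize_boilers(df1)
print(result)
```

Working on a copy leaves df1 untouched, which makes it easy to compare the original and anonymized columns before overwriting anything.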