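I have a list of various values that I need to replace with a single value (Drive-by). I did my research, but the closest post I could find is the one linked below, and it doesn't use pandas. What is the most practical way to achieve this?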
Python replace multiple strings
fourth = pd.read_csv('C:/infocentertracker.csv')
fourth = fourth.rename(columns={'Phone Number: ': 'Phone Number:'})
fourth['Source:'] = fourth['Source:'].replace('......', 'Drive-by')
fourth.to_csv(.............)
Replace all with Drive-by:
Drive By
Drive-By
Drive-by; Return Visitor
Drive/LTX.com/Internes Srch
Driving By/Lantana Website
Drive by
Driving By/Return Visitor
Drive by/Resident Referral
Driving by
Drive- by
Driving by/LTX Website
Driving By
Driving by/Return Visitor
Drive By/Return Visitor
Drive By/LTX Website
Answer 0 (score: 2)
You can replace all values that start with Driv by using a boolean mask built with str.startswith. The idea comes from a comment by Marat:
df.loc[df.col.str.startswith('Driv'), 'col'] = 'Drive-by'
Sample:
print (fourth)
Source:
0 Drive By
1 Drive-By
2 Drive-by; Return Visitor
3 Drive/LTX.com/Internes Srch
4 Driving By/Lantana Website
5 Drive by
6 Driving By/Return Visitor
7 Drive by/Resident Referral
8 Driving by
9 Drive- by
10 Driving by/LTX Website
11 Driving By
12 Driving by/Return Visitor
13 Drive By/Return Visitor
14 Drive By/LTX Website
15 aaa
fourth.loc[fourth['Source:'].str.startswith('Driv'), 'Source:'] = 'Drive-by'
print (fourth)
Source:
0 Drive-by
1 Drive-by
2 Drive-by
3 Drive-by
4 Drive-by
5 Drive-by
6 Drive-by
7 Drive-by
8 Drive-by
9 Drive-by
10 Drive-by
11 Drive-by
12 Drive-by
13 Drive-by
14 Drive-by
15 aaa
Another solution with Series.mask:
fourth['Source:']=fourth['Source:'].mask(fourth['Source:'].str.startswith('Driv', na=False),
'Drive-by')
print (fourth)
Source:
0 Drive-by
1 Drive-by
2 Drive-by
3 Drive-by
4 Drive-by
5 Drive-by
6 Drive-by
7 Drive-by
8 Drive-by
9 Drive-by
10 Drive-by
11 Drive-by
12 Drive-by
13 Drive-by
14 Drive-by
15 aaa
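One thing worth checking before using the .loc version on a real CSV is missing values. The following is only a minimal sketch on made-up data (the NaN row is my assumption, standing in for a blank cell in the Source: column); it passes na=False so that missing values count as "no match" and the mask stays purely boolean.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the NaN row stands in for a blank cell in the CSV.
fourth = pd.DataFrame(
    {'Source:': ['Drive By', 'Driving by/LTX Website', 'Drive- by', np.nan, 'aaa']}
)

# na=False treats missing values as "does not start with Driv",
# so the mask contains only True/False.
is_drive = fourth['Source:'].str.startswith('Driv', na=False)

# Overwrite only the matching rows; NaN and unrelated values are left alone.
fourth.loc[is_drive, 'Source:'] = 'Drive-by'
print(fourth)
```

Without na=False the mask itself can contain NaN, which boolean indexing with .loc may reject, so the argument is a cheap safeguard even if the column looks complete.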
Answer 1 (score: 1)
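Since you asked for a pandas method, here is one option (using .loc, because .ix has been removed from current pandas versions):
fourth.loc[fourth['column name with values'].str.contains('driv', case=False, na=False), 'column name with values'] = 'Drive-by'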
I prefer using regular expressions, which don't necessarily require pandas:
import re
[re.sub('(Driv.+)', 'Drive-by', i) for i in fourth['column name']]
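Note that the pattern above matches Driv anywhere in the string and is case-sensitive. As a rough sketch of a stricter variant, assuming every value is a string (the sample list here is made up and stands in for the column), you could anchor the pattern at the start and ignore case:

```python
import re

# Hypothetical sample values standing in for the column contents.
values = ['Drive By', 'Driving by/LTX Website', 'drive- by', 'Test Drive', 'aaa']

# ^ anchors the match at the start of the string;
# re.IGNORECASE also catches lowercase spellings such as "drive- by".
pattern = re.compile(r'^driv.*', re.IGNORECASE)

# Values that do not start with "driv" are returned unchanged by sub().
cleaned = [pattern.sub('Drive-by', v) for v in values]
print(cleaned)
```

The list comprehension only builds a new list, so the result still has to be assigned back to the DataFrame column if you want the change to persist.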
Answer 2 (score: 0)
You can replace multiple values (a list) with a single value in pandas:
govt_alias = ['govt', 'govern']
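# regex=True is required for '|' to act as alternation (pandas 2.0+ matches literally by default)
df['installer'].str.replace('|'.join(govt_alias), 'government', regex=True)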
In your particular case the other answers are a better fit, but the approach I show is generalizable.
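If the alias list may contain characters that are special in regular expressions, escaping each alias before joining is safer. This is just a sketch under that assumption; the DataFrame contents and the 'gov.' alias are invented for illustration.

```python
import re

import pandas as pd

# Hypothetical data and alias list, for illustration only.
df = pd.DataFrame({'installer': ['govt', 'gov. dept', 'central govt', 'private']})
govt_alias = ['govt', 'gov.']

# re.escape keeps characters such as "." literal when the aliases
# are joined into a single alternation pattern.
pattern = '|'.join(re.escape(alias) for alias in govt_alias)

# str.replace returns a new Series, so assign it back to keep the result.
df['installer'] = df['installer'].str.replace(pattern, 'government', regex=True)
print(df)
```

This replaces the matching substring rather than the whole cell, so 'central govt' becomes 'central government'. To overwrite the entire value instead, a boolean mask as in the other answers is the better tool.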