If I have a DataFrame:
df_d = {'Element':['customer full name','full name','name','religion','account number','lgbt','lgbt identity']}
df = pd.DataFrame(data=df_d)
df['Match'] = ''
And I have a dictionary:
d = {'name':'Contact', 'religio':'Behavioral', 'lgbt':'Identity'}
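If an element contains a dictionary key, how can I fill df['Match'] with the corresponding dictionary value? I can get it to fill the column for exact matches: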
for i in range(len(df)):
    if df['Element'][i] in d:
        df.loc[i, 'Match'] = d[df['Element'][i]]
But I can't get it to work for partial matches. Sorry, my browser won't let me copy and paste the cell output. Thanks!
Answer 0 (score: 1)
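Series.str.extract + map
We can build a regex pattern from the keys of the mapping dictionary, use that pattern with extract to pull out the matching key as a capture group, and then map the captured key to its value in the dictionary: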
df['Match'] = df['Element'].str.extract(fr"({'|'.join(d.keys())})", expand=False).map(d)
              Element       Match
0  customer full name     Contact
1           full name     Contact
2                name     Contact
3            religion  Behavioral
4      account number         NaN
5                lgbt    Identity
6       lgbt identity    Identity
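
For reference, here is a self-contained sketch of the same extract + map approach that can be run as is. Two details are my own additions rather than part of the answer above: the keys are passed through re.escape in case a key ever contains regex metacharacters, and they are sorted longest first so that, when two keys could match at the same position, the longer one wins. The variable names keys and pattern are just illustrative.

```python
import re
import pandas as pd

df = pd.DataFrame({'Element': ['customer full name', 'full name', 'name',
                               'religion', 'account number', 'lgbt',
                               'lgbt identity']})
d = {'name': 'Contact', 'religio': 'Behavioral', 'lgbt': 'Identity'}

# Assumption (not in the original answer): escape each key so it is matched
# literally, and try longer keys first at any given position.
keys = sorted(d, key=len, reverse=True)
pattern = "(" + "|".join(re.escape(k) for k in keys) + ")"

# extract returns the first key found in each element (NaN if none);
# map then looks that key up in the dictionary.
df['Match'] = df['Element'].str.extract(pattern, expand=False).map(d)
print(df)
```

Rows with no matching key, such as 'account number', are left as NaN; chaining .fillna('') after .map(d) should give back the empty-string default from the question if that is preferred.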