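I want to remove every form of the phrase 'Thank you' / 'thank u' / 'thanks!' etc.
(lowercase, uppercase, short forms, ...) using the pandas replace function.
At the moment I just match each variant one by one, which works, but is there a more efficient way to do this?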
df.text_col.replace(to_replace='Thank you',value='',inplace=True,regex=True)
df.text_col.replace(to_replace='thank you',value='',inplace=True,regex=True)
df.text_col.replace(to_replace='th(.+)u',value='',inplace=True,regex=True)
.
.
Answer 0 (score: 0)
I suggest listing all the forms of "thank you" that you want to get rid of:
thanks_to_delete = '|'.join(['thanks', 'thank you'])
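Then use the following one-liner for a case-insensitive replacement. Pass regex=True explicitly, because since pandas 2.0 str.replace treats the pattern as a literal string by default, and the | alternation would then never match:
df.text_col.str.replace(thanks_to_delete, '', case=False, regex=True)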
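Test:
import pandas as pd

thanks_to_delete = '|'.join(['thanks', 'thank you'])
df = pd.DataFrame({
    'text_col': ['Thank you very much for your patience',
                 'I would just want to thank you for your patience',
                 'Thanks for your patience']
})
# regex=True makes '|' act as alternation; case=False ignores letter case
df.text_col.str.replace(thanks_to_delete, '', case=False, regex=True)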
0                very much for your patience
1    I would just want to  for your patience
2                          for your patience
Name: text_col, dtype: object
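The list above only covers 'thanks' and 'thank you', and the result still contains leftover spaces. As a rough sketch (not part of the original answer), the same idea can be extended to short forms such as 'thank u' and to trailing punctuation such as 'thanks!'. The variants list, the sample rows, and the names alternatives, pattern, and cleaned are assumptions chosen for illustration; adjust them to your data.

```python
import re

import pandas as pd

# Hypothetical list of variants to strip; extend it as needed.
variants = ['thank you', 'thank u', 'thanks', 'thx']

# Escape each entry and try longer entries first, so that a longer phrase
# wins over any shorter phrase that happens to be its prefix.
alternatives = '|'.join(
    re.escape(v) for v in sorted(variants, key=len, reverse=True)
)

# \b keeps words such as 'thankful' intact; [!.,]* also removes
# punctuation that directly follows the phrase.
pattern = r'\b(?:' + alternatives + r')\b[!.,]*'

# Illustrative sample data (assumed, not from the question).
df = pd.DataFrame({
    'text_col': ['Thank you very much for your patience',
                 'I would just want to thank u for your patience',
                 'THANKS! I am thankful for your patience']
})

cleaned = (
    df['text_col']
    .str.replace(pattern, '', case=False, regex=True)
    .str.replace(r'\s+', ' ', regex=True)  # collapse leftover whitespace runs
    .str.strip()                           # drop leading/trailing spaces
)
print(cleaned.tolist())
```

Compared with a broad pattern like 'th(.+)u', which is greedy and can swallow everything between the first 'th' and the last 'u' in a row, an explicit list with word boundaries only removes the phrases you name. In the sample above, 'thankful' is left untouched.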