Using a regular expression in the pandas replace function

Time: 2016-04-06 18:52:07

Tags: python regex pandas

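I want to remove all forms of the phrase 'Thank you' / 'thank u' / 'thanks!' etc. (lowercase/uppercase, short forms, ...) using the pandas replace function.

I am currently hard-coding each variant, which works, but is there a more efficient way to do this?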

df.text_col.replace(to_replace='Thank you',value='',inplace=True,regex=True)
df.text_col.replace(to_replace='thank you',value='',inplace=True,regex=True)
df.text_col.replace(to_replace='th(.+)u',value='',inplace=True,regex=True)
                                   .
                                   .

1 Answer:

Answer 0 (score: 0)

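I suggest enumerating all the variants of "thank you" that you want to get rid of and joining them into a single regex alternation: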

thanks_to_delete = '|'.join(['thanks', 'thank you'])
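
If the list grows to include variants with punctuation, it may be safer to escape each phrase and put longer variants first, so that 'thanks!' is tried before 'thanks'. The following is only a sketch of that idea; the `variants` list is a hypothetical example, not something from the question's data:

```python
import re

# Hypothetical list of variants; extend it with whatever forms appear in your data.
variants = ['thanks', 'thank you', 'thank u', 'thanks!']

# Sort longest first so that 'thanks!' is tried before 'thanks', and escape each
# phrase so that regex metacharacters in a phrase are matched literally.
thanks_to_delete = '|'.join(
    re.escape(v) for v in sorted(variants, key=len, reverse=True)
)

# Quick check with the standard library before handing the pattern to pandas.
sample = re.sub(thanks_to_delete, '', 'Thanks! and thank u', flags=re.IGNORECASE)
```

The resulting string can be passed to pandas in the same way as the shorter pattern above.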

Then use the following one-liner for a case-insensitive replacement:

# regex=True is needed on pandas 2.0+, where str.replace defaults to literal matching
df.text_col.str.replace(thanks_to_delete, '', case=False, regex=True)
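
If you would rather stay with `Series.replace`, as in the question, one possible approach is to put the inline flag `(?i)` at the start of the pattern, since `Series.replace` has no `case` argument. This is a minimal sketch on made-up sample rows, not the asker's data:

```python
import pandas as pd

# Hypothetical sample rows for illustration.
df = pd.DataFrame({'text_col': ['Thank you very much', 'THANKS for waiting']})

# (?i) switches on case-insensitive matching for the whole pattern;
# regex=True makes replace() treat to_replace as a regular expression.
cleaned = df['text_col'].replace(r'(?i)thanks|thank you', '', regex=True)
```

Here `cleaned` is a new Series; assign it back to `df['text_col']` if you want to overwrite the column.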

Test:

import pandas as pd

df = pd.DataFrame({
    'text_col': ['Thank you very much for your patience',
                 'I would just want to thank you for your patience',
                 'Thanks for your patience']
})

df.text_col.str.replace(thanks_to_delete, '', case=False, regex=True)
0                very much for your patience
1    I would just want to  for your patience
2                          for your patience
Name: text_col, dtype: object
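
Note that the output above still has leading and doubled spaces where the phrase was removed. As a rough sketch, assuming you also want to catch 'thank u' and a trailing '!', a more compact pattern plus a whitespace cleanup could look like this (the `pattern` and the sample rows are illustrative assumptions):

```python
import pandas as pd

# Hypothetical sample rows covering the forms mentioned in the question.
df = pd.DataFrame({
    'text_col': ['Thank you very much for your patience',
                 'I would just want to thank u for your patience',
                 'Thanks! for your patience']
})

# 'thank' followed by 's', ' you' or ' u', ending at a word boundary,
# plus an optional trailing '!'.
pattern = r'\bthank(?:s| you| u)\b!?'

cleaned = (
    df['text_col']
    .str.replace(pattern, '', case=False, regex=True)
    .str.replace(r'\s+', ' ', regex=True)  # collapse runs of whitespace
    .str.strip()                           # trim leading/trailing spaces
)
```

The word boundaries keep the pattern from touching words such as 'thankful' or 'thank yourself', but it is still worth checking it against a sample of your real text.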