I need to replace a backslash with something else, and I wrote this code to test the basic idea. It works fine:
test_string = '19631 location android location you enter an area enable quick action honeywell singl\\dzone thermostat environment control and monitoring'
print(test_string)
test_string = test_string.replace('singl\\dzone ','singl_dbl_zone ')
print(test_string)
19631 location android location you enter an area enable quick action honeywell singl\dzone thermostat environment control and monitoring
19631 location android location you enter an area enable quick action honeywell singl_dbl_zone thermostat environment control and monitoring
However, I have a pandas DataFrame full of these (reconstructed) strings, and when I try the same operation on the df, it does not work.
raw_corpus.loc[:,'constructed_recipe']=raw_corpus['constructed_recipe'].str.replace('singl\\dzone ','singl_dbl_zone ')
The backslash is still there!
323096 you enter an area android location location environment control and monitoring honeywell singl\dzone thermostat enable quick action
Answer 0: (score: 2)
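There is a difference between Python's built-in str.replace and pd.Series.str.replace. The former replaces a literal substring; the latter treats the pattern as a regular expression (by default in pandas versions before 2.0; from 2.0 on, only when regex=True is passed). As a regex, the pattern 'singl\\dzone ' reads as "singl", then any digit (\d), then "zone ", so it never matches the literal backslash.
With pd.Series.str.replace in regex mode, you need to pass a raw string in which the backslash is escaped.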
df['col'] = df['col'].str.replace(r'\\d', '_dbl_', regex=True)
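To see why the pattern from the question fails once it is interpreted as a regular expression, here is a minimal sketch that uses only the standard-library re module in place of pandas. The shortened sample text and the variable names are made up for illustration.

```python
import re

# Sample text containing one literal backslash (shortened from the question).
text = 'honeywell singl\\dzone thermostat'

# The pattern from the question. As a regex, \d means "any digit",
# so it does not match the literal backslash and nothing is replaced.
unchanged = re.sub('singl\\dzone ', 'singl_dbl_zone ', text)

# In a raw string, a doubled backslash matches one literal backslash.
fixed = re.sub(r'singl\\dzone ', 'singl_dbl_zone ', text)

# re.escape builds an equivalent escaped pattern from the literal text.
escaped = re.sub(re.escape('singl\\dzone '), 'singl_dbl_zone ', text)

print(unchanged)
print(fixed)
print(escaped)
```

The first call leaves the text untouched, which mirrors the behaviour seen on the DataFrame; the other two replace the substring.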
Answer 1: (score: 1)
I think it would be easier to just remove the backslash itself:
In [165]: df
Out[165]:
constructed_recipe
0 singl\dzone
In [166]: df['constructed_recipe'] = df['constructed_recipe'].str.replace(r'\\', '')
In [167]: df
Out[167]:
constructed_recipe
0 singldzone
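Another option, sketched here on the assumption that the installed pandas supports the regex keyword of Series.str.replace, is to ask for a literal substring replacement with regex=False. The pattern from the question then works unchanged. The one-row DataFrame is made up for illustration.

```python
import pandas as pd

# Hypothetical one-row frame holding a string with a literal backslash.
df = pd.DataFrame({'constructed_recipe': ['honeywell singl\\dzone thermostat']})

# regex=False makes pandas treat the pattern as plain text,
# the same way Python's built-in str.replace does.
df['constructed_recipe'] = df['constructed_recipe'].str.replace(
    'singl\\dzone ', 'singl_dbl_zone ', regex=False
)

print(df['constructed_recipe'].iloc[0])
```

Passing regex explicitly also keeps the call independent of the pandas version's default.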