Pandas DataFrame replace does not work for sub-phrases

Time: 2020-01-30 17:56:52

Tags: python regex pandas dataframe replace

I have a DataFrame in which column A contains values of the following form:

Col A

this is to be replaced
nonsense, this is to be replaced
nonsense
garbage
this is to be replaced, nonsense

Desired output:

Col A

this has been replaced
nonsense, this has been replaced
nonsense
garbage
this has been replaced, nonsense

I have tried:

df['Col A'].replace('this is to be replaced', 'this has been replaced')
df['Col A'].str.replace('this is to be replaced', 'this has been replaced', regex=True, inplace=True)
df['Col A'].replace({'this is to be replaced':'this has been replaced'}, regex=True, inplace=True)
df['Col A'].replace(regex= ['this is to be replaced'], value= 'this has been replaced')

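Basically all the standard ways of solving this problem. The issue seems to be the spaces in the substring. When I try to replace a single specific word, it works fine.

Any ideas?

Edit: I tried all the suggestions, but they do not work. As additional background:

The exact string to be replaced is: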

MATHEMATICS (Math 1601 & 1602)

MATHEMATICS (Math 1601 & Math 1602)

I have also tried:

df['col A'] = df['col A'].replace('1602', 'Math 1602')

3 answers:

Answer 0: (score: 0)

Is this what you are looking for?

import pandas as pd

df = pd.DataFrame({'Col A':
['this is to be replaced',
'nonsense, this is to be replaced',
'nonsense',
'garbage',
'this is to be replace, nonsense']})
# regex=True makes replace() search for the pattern inside each cell
df.replace(to_replace=['is to be'], value='has been', regex=True, inplace=True)
df
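
As a side note on why the first attempt in the question appears to do nothing: as far as I understand, `Series.replace` without `regex=True` only replaces cells whose entire value equals the search string, and it returns a new Series instead of modifying the column. The sketch below illustrates this on part of the sample data; the names `whole_cell` and `substring` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Col A': [
    'this is to be replaced',
    'nonsense, this is to be replaced',
    'nonsense',
]})

# Without regex=True, only cells that equal the search string exactly are replaced.
whole_cell = df['Col A'].replace('this is to be replaced', 'this has been replaced')

# With regex=True, the pattern is searched for inside each cell.
substring = df['Col A'].replace('this is to be replaced', 'this has been replaced', regex=True)

print(whole_cell.tolist())
print(substring.tolist())
```

Neither call changes `df` itself, so the result still has to be assigned back to the column.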


Answer 1: (score: 0)

You can simply use replace and pass it the right arguments:

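import pandas as pd

data = {'index':[1,2,3,4,5],'Col A':['this is to be replaced','nonsense, this is to be replaced','nonsense','garbage','this is to be replaced']}
df = pd.DataFrame(data)
print(df)
# Assign the result back rather than using inplace=True on a selected column,
# which may not update df in newer pandas versions
df['Col A'] = df['Col A'].replace('is to be','has been',regex=True)
print(df)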

Output:

   index                             Col A
0      1            this has been replaced
1      2  nonsense, this has been replaced
2      3                          nonsense
3      4                           garbage
4      5            this has been replaced

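Answer 2: (score: 0)

The problem is that you are not assigning the result back to the DataFrame column.

You can achieve your goal with the following code: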

import pandas as pd
l = ["this is to be replaced","nonsense, this is to be replaced","nonsense","garbage","this is to be replace, nonsense"]
df = pd.DataFrame(l,columns=["Col A"])
df["Col A"] = df["Col A"].str.replace("is to be","has been")

The new DataFrame then looks like this:

>>> df
                              Col A
0            this has been replaced
1  nonsense, this has been replaced
2                          nonsense
3                           garbage
4   this has been replace, nonsense
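
For the exact string in the question's edit, the parentheses are probably the real obstacle rather than the spaces: with `regex=True`, `(` and `)` are regex grouping characters, so a pattern containing them no longer matches the literal text. The following is a minimal sketch, assuming the column holds that string; the sample rows are made up for illustration. It uses `str.replace` with `regex=False` for a literal match, and `re.escape` for the case where a regex is still wanted.

```python
import re
import pandas as pd

df = pd.DataFrame({'Col A': [
    'MATHEMATICS (Math 1601 & 1602)',
    'nonsense, MATHEMATICS (Math 1601 & 1602)',
    'garbage',
]})

old = 'MATHEMATICS (Math 1601 & 1602)'
new = 'MATHEMATICS (Math 1601 & Math 1602)'

# Literal substring replacement: no regex metacharacters are interpreted.
literal = df['Col A'].str.replace(old, new, regex=False)

# Regex form: escape the pattern so '(' and ')' match literally.
escaped = df['Col A'].replace(re.escape(old), new, regex=True)

# Unescaped, the parentheses form a group and the pattern does not match the text.
unescaped = df['Col A'].replace(old, new, regex=True)

# Assign the result back to the column.
df['Col A'] = literal
print(df)
```

The attempt `df['col A'].replace('1602', 'Math 1602')` from the question likely fails for a different reason: without `regex=True` it only matches cells whose whole value is `'1602'`.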