How do I replace multiple substrings in a pandas Series using a dictionary?

Time: 2019-03-02 12:16:05

Tags: python pandas

I have a pandas Series of strings. I want to replace multiple substrings in each row, for example:

import pandas as pd

testdf = pd.Series([
    'Mary went to school today',
    'John went to hospital today'
])
to_sub = {
    'Mary': 'Alice',
    'school': 'hospital',
    'today': 'yesterday',
    'tal': 'zzz',
}
testdf = testdf.replace(to_sub, regex=True)  # does not work (only replaces one instance per row)
print(testdf)

In the case above, the desired output is:

Alice went to hospital yesterday.
John went to hospizzz yesterday.

Note that the first row matches three of the dictionary's entries.

Apart from processing the rows one by one (in a for loop), how can I do this efficiently?

Following answers to other questions, I tried df.replace(...), but it replaced only one substring, giving a result like: Alice went to school today, where school and today are not replaced.

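Another thing to note is that, for any given row, the replacements should all happen at once. (Note that hospital in the first row must not then be replaced with hospizzz; that would be wrong.)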
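To make the "all at once" requirement concrete, here is a small sketch of what goes wrong when the replacements are applied one after another. The helper name replace_one_by_one is hypothetical and only serves the illustration; it uses plain str.replace on a single row.

```python
# Dictionary from the question.
to_sub = {
    'Mary': 'Alice',
    'school': 'hospital',
    'today': 'yesterday',
    'tal': 'zzz',
}

def replace_one_by_one(text, mapping):
    # Hypothetical helper: each pass sees the output of the previous pass,
    # so an earlier replacement can be altered by a later one.
    for old, new in mapping.items():
        text = text.replace(old, new)
    return text

cascaded = replace_one_by_one('Mary went to school today', to_sub)
print(cascaded)  # Alice went to hospizzz yesterday (hospital was replaced again)
```

The word school first becomes hospital, and the later 'tal' entry then rewrites it to hospizzz, which is the cascade the question wants to avoid.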

2 Answers:

Answer 0: (score: 2)

You can use:

#Borrowed from an external website
def multipleReplace(text, wordDict):
    for key in wordDict:
        text = text.replace(key, wordDict[key])
    return text

print(testdf.apply(lambda x: multipleReplace(x,to_sub)))

0    Alice went to hospital yesterday
1     John went to hospital yesterday
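Note that the loop above applies the replacements one after another, and the output shown matches a dictionary without the 'tal' entry; with 'tal': 'zzz' included, hospital would be turned into hospizzz in both rows. A possible alternative, offered as a minimal sketch rather than the answerer's method, is to build one regular expression from all the keys and replace every match in a single pass. The name make_replacer is hypothetical, and the sketch assumes a non-empty dictionary whose keys are literal strings.

```python
import re

def make_replacer(mapping):
    # Hypothetical helper; assumes mapping is non-empty.
    # Longer keys come first so that overlapping keys prefer the longest match.
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape treats every key as literal text, not as a regex.
    pattern = re.compile('|'.join(re.escape(k) for k in keys))

    def replace_all(text):
        # Each match is looked up in the original text only once,
        # so replaced text is never scanned again.
        return pattern.sub(lambda m: mapping[m.group(0)], text)

    return replace_all

to_sub = {
    'Mary': 'Alice',
    'school': 'hospital',
    'today': 'yesterday',
    'tal': 'zzz',
}
replace_all = make_replacer(to_sub)
rows = ['Mary went to school today', 'John went to hospital today']
results = [replace_all(row) for row in rows]
print(results)
# ['Alice went to hospital yesterday', 'John went to hospizzz yesterday']
```

With the Series from the question, the same function could be applied as testdf.apply(replace_all).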

Edit

Using the dictionary mentioned in the comments below:

to_sub = {
    'Mary': 'Alice',
    'school': 'hospital',
    'today': 'yesterday',
    'tal': 'zzz'
}

testdf.apply(lambda x: ' '.join([to_sub.get(i, i) for i in x.split()]))

Output:

0    Alice went to hospital yesterday
1     John went to hospital yesterday

Answer 1: (score: 0)

It works for me with pandas version 23.0...

Given the Series:

>>> testdf
0      Mary went to school today
1    John went to hospital today
dtype: object

The values to be replaced:

>>> replace_values = {'Mary': 'Alice', 'school': 'hospital', 'today': 'yesterday'}

Result:

>>> testdf.replace(replace_values, regex=True)
0    Alice went to hospital yesterday
1     John went to hospital yesterday
dtype: object

Another example, with the desired result:

Including the partial-string replacement ('tal': 'zzz'):

>>> replace_values = {'Mary': 'Alice', 'school': 'hospital', 'today': 'yesterday', 'tal': 'zzz'}
>>> testdf.replace(replace_values, regex=True)
0    Alice went to hospizzz yesterday
1     John went to hospizzz yesterday
dtype: object
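In the output above, row 0 also ends up as hospizzz, whereas the question asks for hospital in that row. A possible way to get single-pass behaviour while staying within pandas is Series.str.replace with a callable replacement. This is a sketch under the assumption that all keys are literal strings; the names pattern and result are illustrative.

```python
import re
import pandas as pd

testdf = pd.Series([
    'Mary went to school today',
    'John went to hospital today',
])
replace_values = {'Mary': 'Alice', 'school': 'hospital', 'today': 'yesterday', 'tal': 'zzz'}

# One alternation of all keys, escaped so they match literally;
# longer keys first so overlapping keys prefer the longest match.
pattern = '|'.join(re.escape(k) for k in sorted(replace_values, key=len, reverse=True))

# The callable receives each regex match object and returns its replacement,
# so every substring of the original row is replaced at most once.
result = testdf.str.replace(pattern, lambda m: replace_values[m.group(0)], regex=True)
print(result.tolist())
# ['Alice went to hospital yesterday', 'John went to hospizzz yesterday']
```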