Replace the whole string if it contains a substring in a pandas DataFrame, using a list of values

Time: 2020-02-04 17:11:34

Tags: python pandas

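Is there a way to do what the question below asks, but with a dict/array that replaces many values in fewer lines of code, instead of a single string value?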

Replace whole string if it contains substring in pandas

What I have so far:

key = [
    {
        "substr": ["foo1", "foo2"],
        "new_val": "bar"
    },
]

for i in range(len(key)):
    df.loc[df[column].str.contains('|'.join(key[i]['substr'])), column] = key[i]['new_val']

Can this be improved?

1 answer:

Answer 0: (score: 0)

Try:

for el in key:
    df[column] = df[column].str.replace('.*(' + '|'.join(el["substr"]) + ').*', el["new_val"], regex=True)

Output (dummy data):

import pandas as pd

key = [
    {
        "substr": ["foo1", "foo2"],
        "new_val": "bar"
    }
]

df = pd.DataFrame({"x": ["foo1xyz", "abcfoo", "zyc", "foyo2foo2g"], "y": [1, 2, 3, 4]})

for el in key:
    df["x"] = df["x"].str.replace('.*(' + '|'.join(el["substr"]) + ').*', el["new_val"], regex=True)

>> df

        x  y
0     bar  1
1  abcfoo  2
2     zyc  3
3     bar  4
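
One caveat with joining the substrings straight into a regex: any regex metacharacter in a substring (such as `.` or `+`) is interpreted as a pattern rather than matched literally. A possible variant, offered only as a sketch, escapes each substring with `re.escape` and assigns through a boolean mask from `str.contains`. The helper name `replace_if_contains`, the second rule (`"a.c"`), and the extra row `"a.c!"` are made up for illustration and are not part of the original question or answer.

```python
import re

import pandas as pd


def replace_if_contains(series, rules):
    """Hypothetical helper: replace a whole cell when it contains any listed substring."""
    result = series.copy()
    for rule in rules:
        # re.escape makes every substring match literally, even if it
        # contains regex metacharacters such as "." or "+".
        pattern = "|".join(re.escape(s) for s in rule["substr"])
        # na=False treats missing values as "no match".
        mask = result.str.contains(pattern, regex=True, na=False)
        result.loc[mask] = rule["new_val"]
    return result


key = [
    {"substr": ["foo1", "foo2"], "new_val": "bar"},
    {"substr": ["a.c"], "new_val": "dot"},  # illustrative rule with a metacharacter
]

df = pd.DataFrame(
    {"x": ["foo1xyz", "abcfoo", "zyc", "foyo2foo2g", "a.c!"], "y": [1, 2, 3, 4, 5]}
)
df["x"] = replace_if_contains(df["x"], key)
print(df)
```

With the escaping in place, `"abcfoo"` is left alone because it does not contain the literal text `a.c`, whereas an unescaped `a.c` pattern would have matched its `abc` prefix. Rules are applied in order, so a later rule sees the values already rewritten by earlier ones.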