Find and replace in a loop in pandas

Date: 2018-11-07 10:18:07

Tags: pandas

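I have a DataFrame containing many strings that need to be changed.

I can do this one line at a time, but I don't know how to loop over the strings and replace each one with its new value.

This is my code so far: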

df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "Direct Mail","DM"))
df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "DR TV","DRTV"))
df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "Affilliates","Affiliates"))
df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "DRTV","TV"))
df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "All Time TV","TV"))
df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "Peak TV","TV"))
df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "Regional Press","Press"))
df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "National Press","Press"))

But I feel that something like this should work:

dic= {Direct Mail:DM}


for i and j in dic:

df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, i,j))

where i is "Direct Mail" and j is "DM".

3 answers:

Answer 0 (score: 2)

Since you want to iterate, you can do the following.

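# Build the mapping once, outside the loop. Order matters:
# "DR TV" becomes "DRTV", which the next entry then turns into "TV".
d = {"Direct Mail":"DM",
     "DR TV":"DRTV",
     "DRTV":"TV",
     "All Time TV":"TV",
     "Peak TV":"TV",
     "Regional Press":"Press",
     "National Press":"Press"
}

# Visit every row (assumes unique index labels) and apply each replacement in turn.
for idx in df.index:
    for x, y in d.items():
        df.at[idx, 'MediaChannel'] = df.at[idx, 'MediaChannel'].replace(x, y)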
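
The same dictionary can also drive a loop over the replacements rather than over the rows, which is closer to what the question sketches. The following is a minimal sketch under stated assumptions: the sample rows are hypothetical, and `Series.str.replace` with `regex=False` is used so that each key is treated as a literal substring.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame(
    {"MediaChannel": ["Direct Mail", "DR TV", "Peak TV", "Regional Press"]}
)

# Insertion order matters: "DR TV" becomes "DRTV", which a later entry turns into "TV".
d = {
    "Direct Mail": "DM",
    "DR TV": "DRTV",
    "DRTV": "TV",
    "Peak TV": "TV",
    "Regional Press": "Press",
}

for old, new in d.items():
    # regex=False treats `old` as a plain substring, like Python's str.replace.
    df["MediaChannel"] = df["MediaChannel"].str.replace(old, new, regex=False)

result = df["MediaChannel"].tolist()
print(result)  # ['DM', 'TV', 'TV', 'Press']
```

Each pass works on the whole column at once, so the Python-level loop runs once per dictionary entry instead of once per row.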

Answer 1 (score: 1)

First create a dictionary of replacements:

d = {"Direct Mail":"DM", 
     "DR TV":"DRTV",
     ...}

If you want to replace substrings, use replace with regex=True:

df_media_input['MediaChannel'] = df_media_input['MediaChannel'].replace(d, regex=True)
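
One thing to keep in mind with regex=True is that the dictionary keys are interpreted as regular-expression patterns, so keys containing characters such as parentheses or dots may not match literally. The sketch below uses hypothetical channel names, not values from the question, and escapes each key with `re.escape` before the replacement.

```python
import re

import pandas as pd

# Hypothetical values, chosen because "(" and ")" are regex metacharacters.
s = pd.Series(["TV (Peak)", "TV (Peak) North", "Radio"])
d = {"TV (Peak)": "TV"}

# Unescaped, the key is the pattern "TV " followed by a group matching "Peak",
# so it does not match the literal parentheses in the data.
unescaped = s.replace(d, regex=True).tolist()

# Escaping each key makes it match as literal text.
escaped_keys = {re.escape(old): new for old, new in d.items()}
escaped = s.replace(escaped_keys, regex=True).tolist()

print(unescaped)  # ['TV (Peak)', 'TV (Peak) North', 'Radio']
print(escaped)    # ['TV', 'TV North', 'Radio']
```

The keys in the question contain only letters and spaces, so escaping changes nothing there; it matters only when the strings to replace contain pattern characters.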

If you want to replace whole values faster, use map with fillna:

df_media_input['MediaChannel'] = (df_media_input['MediaChannel'].map(d)
                                     .fillna(df_media_input['MediaChannel']))

Compare the two approaches on a sample:

import pandas as pd

df_media_input = pd.DataFrame({'MediaChannel':['Direct Mail','DR TV new','val']})
print (df_media_input)
  MediaChannel
0  Direct Mail
1    DR TV new
2          val

d = {"Direct Mail":"DM", "DR TV":"DRTV"}


df_media_input['MediaChannel1'] = df_media_input['MediaChannel'].replace(d, regex=True)

df_media_input['MediaChannel2'] = (df_media_input['MediaChannel'].map(d)
                                     .fillna(df_media_input['MediaChannel']))
print (df_media_input)
  MediaChannel MediaChannel1 MediaChannel2
0  Direct Mail            DM            DM
1    DR TV new      DRTV new     DR TV new
2          val           val           val

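Answer 2 (score: 1)

The pandas DataFrame replace method accepts a dictionary whose keys are the existing strings and whose values are the strings to replace them with.

In your example: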

replacements = {
    "Direct Mail": "DM",
    "DR TV": "DRTV",
    # and so on...
}

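# Assign the result back; calling replace with inplace=True on a selected column
# may not update the original DataFrame.
df_media_input['MediaChannel'] = df_media_input['MediaChannel'].replace(replacements)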

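This assumes that the values in the "MediaChannel" column are exactly the strings to be replaced, rather than longer strings that contain them. For example, "Direct Mail" will be changed to "DM", but "I like Direct Mail" will not be changed to "I like DM". To handle the substring case, you need to set the regex keyword argument of replace to True.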
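
The difference between the two modes can be checked with a short sketch. The two-row Series below is hypothetical and simply reuses the strings from the explanation above.

```python
import pandas as pd

# Hypothetical two-row sample built from the strings discussed above.
s = pd.Series(["Direct Mail", "I like Direct Mail"])
replacements = {"Direct Mail": "DM"}

# Default behaviour: only values equal to a key are replaced.
exact = s.replace(replacements).tolist()

# regex=True: keys are matched as patterns anywhere inside each value.
substring = s.replace(replacements, regex=True).tolist()

print(exact)      # ['DM', 'I like Direct Mail']
print(substring)  # ['DM', 'I like DM']
```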