Performing multiple replacements in one line of code with pandas

Time: 2018-10-21 17:59:28

Tags: python pandas dataframe replace

I want to change my data as follows:

"Republic of Korea": "South Korea",
"United States of America": "United States",
"United Kingdom of Great Britain and Northern Ireland": "United Kingdom",
"China, Hong Kong Special Administrative Region": "Hong Kong"

Currently, I am using this code:

energy.Country[energy.Country == "Republic of Korea"] = "South Korea"
energy.Country[energy.Country == "United States of America"] = "United States"
energy.Country[energy.Country == "United Kingdom of Great Britain and Northern Ireland"] = "United Kingdom"
energy.Country[energy.Country == "China, Hong Kong Special Administrative Region"] = "Hong Kong"

I tried to do this with the .replace method, passing the arguments as a dictionary:

energy.replace('Country' : {"Republic of Korea": "South Korea", "United States of America": "United States", "United Kingdom of Great Britain and Northern Ireland": "United Kingdom", "China, Hong Kong Special Administrative Region": "Hong Kong"})

But it doesn't seem to work. Is there a cleaner, tidier way to do this?

1 answer:

Answer 0: (score: 4)

Call replace on the Series; it's straightforward.

repl_dict = {"Republic of Korea": "South Korea", ...}
energy['Country'] = energy['Country'].replace(repl_dict)
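For a self-contained illustration, here is a minimal sketch with the dictionary written out in full from the mapping in the question. The sample `energy` frame and its extra "France" row are assumptions made up for the demo, since the real data is not shown. It also shows the nested-dictionary form of `DataFrame.replace`, which is probably what the attempt in the question was reaching for.

```python
import pandas as pd

# Hypothetical sample data; the real `energy` frame is not shown in the question.
energy = pd.DataFrame({
    "Country": [
        "Republic of Korea",
        "United States of America",
        "United Kingdom of Great Britain and Northern Ireland",
        "China, Hong Kong Special Administrative Region",
        "France",
    ]
})

repl_dict = {
    "Republic of Korea": "South Korea",
    "United States of America": "United States",
    "United Kingdom of Great Britain and Northern Ireland": "United Kingdom",
    "China, Hong Kong Special Administrative Region": "Hong Kong",
}

# Series.replace: values that are not keys of the dict are left untouched.
by_series = energy["Country"].replace(repl_dict)

# DataFrame.replace with a nested dict {column: {old: new}} limits the
# replacement to that column and returns a new DataFrame.
by_frame = energy.replace({"Country": repl_dict})

print(by_series.tolist())
```

Neither call modifies `energy` in place, so the result has to be assigned back, as in the answer's second line.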

Note that this is not a good place to use map, because entries in "Country" that do not map to anything in repl_dict will be replaced by NaN.
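The difference can be seen on a tiny example. The two-element Series below is a made-up sample, and the `fillna` line is only one possible workaround if map is used anyway, not something the answer recommends.

```python
import pandas as pd

# Hypothetical sample: one name that is in the dict, one that is not.
countries = pd.Series(["Republic of Korea", "France"])
repl_dict = {"Republic of Korea": "South Korea"}

# map looks every value up in the dict; "France" has no key, so it becomes NaN.
mapped = countries.map(repl_dict)

# replace only touches values that are keys; "France" is kept.
replaced = countries.replace(repl_dict)

# One possible workaround with map: fill the gaps from the original values.
patched = countries.map(repl_dict).fillna(countries)

print(mapped.isna().tolist())
print(replaced.tolist())
```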


Another option is a replacement based on a list comprehension:

energy['Country'] = [
    repl_dict.get(x, x) for x in energy['Country'].tolist()] 

It is not as concise as replace, but it holds up well in terms of performance.
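To check the performance claim on your own data, a rough sketch along these lines may help. The synthetic frame, the helper names `with_replace` and `with_comprehension`, and the repeat count are all assumptions chosen for the demo. Timings depend on the machine and the pandas version, so none are quoted here.

```python
import timeit

import pandas as pd

# Hypothetical data: two country names repeated many times.
energy = pd.DataFrame({"Country": ["Republic of Korea", "France"] * 50_000})
repl_dict = {"Republic of Korea": "South Korea"}


def with_replace():
    # Approach from the answer: Series.replace with a dict.
    return energy["Country"].replace(repl_dict)


def with_comprehension():
    # List comprehension: dict.get(x, x) falls back to the original value.
    return pd.Series(
        [repl_dict.get(x, x) for x in energy["Country"].tolist()],
        index=energy.index,
        name="Country",
    )


# Both approaches should produce the same values.
same = with_replace().tolist() == with_comprehension().tolist()
print("same result:", same)

# Total seconds for 5 runs of each approach.
print("replace:      ", timeit.timeit(with_replace, number=5))
print("comprehension:", timeit.timeit(with_comprehension, number=5))
```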