def replace_name(row):
    if row['Country Name'] == 'Korea, Rep.':
        row['Country Name'] = 'South Korea'
    if row['Country Name'] == 'Iran, Islamic Rep.':
        row['Country Name'] = 'Iran'
    if row['Country Name'] == 'Hong Kong SAR, China':
        row['Country Name'] = 'Hong Kong'
    return row

GDP.apply(replace_name, axis=1)
GDP is a pd.DataFrame.
At this point, when I look up 'South Korea' it does not work: the name is still 'Korea, Rep.'.
But if I change the last line of the code to this:
GDP = GDP.apply(replace_name, axis = 1)
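it works.
At first I thought the reason was that the apply function cannot change GDP itself, but when I processed another DataFrame the same way, it actually worked. The code is as follows: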
def change_name(row):
    if row['Country'] == 'Republic of Korea':
        row['Country'] = 'South Korea'
    if row['Country'] == 'United States of America':
        row['Country'] = 'United States'
    if row['Country'] == 'United Kingdom of Great Britain and Northern Ireland':
        row['Country'] = 'United Kingdom'
    if row['Country'] == 'China, Hong Kong Special Administrative Region':
        row['Country'] = 'Hong Kong'
    return row

energy.apply(change_name, axis=1)
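energy is also a pd.DataFrame.
This time, when I searched for 'United States', it did work. The original name was 'United States of America', so the name was changed successfully.
The only difference between energy and GDP is that energy was read from an Excel file, while GDP was read from a CSV file. So what causes the different results?

Answer 0 (score: 1)
I think it is better to use replace: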
d = {'Korea, Rep.':'South Korea', 'Iran, Islamic Rep.':'Iran',
'Hong Kong SAR, China':'Hong Kong'}
GDP['Country Name'] = GDP['Country Name'].replace(d, regex=True)
Because the difference may be caused by some whitespace in the data, this may help:
GDP['Country'] = GDP['Country'].str.strip()
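To illustrate why stray whitespace matters, here is a minimal sketch. The small `gdp` frame and the `names` mapping are made up for illustration and are not the asker's data; the point is only that an exact comparison misses a padded value, while stripping first lets a plain (non-regex) replace match it.

```python
import pandas as pd

# Hypothetical data: the first name carries stray spaces, as a CSV export might.
gdp = pd.DataFrame({'Country Name': [' Korea, Rep. ', 'Iran, Islamic Rep.']})

# An exact comparison does not match the padded value.
padded_match = (gdp['Country Name'] == 'Korea, Rep.').any()

# After stripping, an exact (non-regex) replace finds both names.
names = {'Korea, Rep.': 'South Korea', 'Iran, Islamic Rep.': 'Iran'}
cleaned = gdp['Country Name'].str.strip().replace(names)

print(padded_match)      # False
print(cleaned.tolist())  # ['South Korea', 'Iran']
```

Stripping first also avoids relying on regex=True, where characters such as '.' in the keys are treated as regular-expression wildcards.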
Sample:
import pandas as pd

GDP = pd.DataFrame({
    'Country Name': [' Korea, Rep. ', 'a', 'Iran, Islamic Rep.',
                     'United States of America', 's',
                     'United Kingdom of Great Britain and Northern Ireland'],
    'Country': ['s', 'Hong Kong SAR, China', 'United States of America',
                'Hong Kong SAR, China', 's', 'f']})
#print (GDP)
d = {'Korea, Rep.':'South Korea', 'Iran, Islamic Rep.':'Iran',
'United Kingdom of Great Britain and Northern Ireland':'United Kingdom',
'Hong Kong SAR, China':'Hong Kong', 'United States of America':'United States'}
#replace by columns
#GDP['Country Name'] = GDP['Country Name'].replace(d, regex=True)
#GDP['Country'] = GDP['Country'].replace(d, regex=True)
#replace multiple columns
GDP[['Country Name','Country']] = GDP[['Country Name','Country']].replace(d, regex=True)
print (GDP)
         Country    Country Name
0              s     South Korea
1      Hong Kong               a
2  United States            Iran
3      Hong Kong   United States
4              s               s
5              f  United Kingdom
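As for why apply seemed to modify one DataFrame but not the other: the pandas documentation notes that functions which mutate the object passed to apply can behave unexpectedly and are not supported. Whether a change to row reaches the original frame likely depends on whether pandas hands the function a view or a copy, which is an internal detail and not something to rely on. A minimal sketch, using a made-up two-column frame, of the pattern that does not depend on that detail:

```python
import pandas as pd

# Hypothetical frame with mixed column types, loosely similar to a GDP table.
df = pd.DataFrame({'Country Name': ['Korea, Rep.', 'Japan'], 'Value': [1, 2]})

def replace_name(row):
    if row['Country Name'] == 'Korea, Rep.':
        row['Country Name'] = 'South Korea'
    return row

# apply returns a new DataFrame; keep that result instead of counting on
# the original being changed as a side effect.
result = df.apply(replace_name, axis=1)

print(result['Country Name'].tolist())  # ['South Korea', 'Japan']
```

Assigning the result back, as in GDP = GDP.apply(replace_name, axis=1), is therefore the dependable form, although a vectorized replace is usually simpler and faster than a row-wise apply.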