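I need to change the values of specific items in a DataFrame column. I did this manually with a for loop. Is there a more efficient way to do it, using an idiom or .where? I don't think the code below is the best approach...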
# Change the names of the countries as requested
for index, row in energy.iterrows():  # change the name of specific countries
    if energy.loc[index, ['Country']].str.contains('United States of America').bool():
        energy.loc[index, ['Country']] = 'United States'
        print(energy.loc[index, ['Country']])
    if energy.loc[index, ['Country']].str.contains('Republic of Korea').bool():
        energy.loc[index, ['Country']] = 'South Korea'
        print(energy.loc[index, ['Country']])
    if energy.loc[index, ['Country']].str.contains('United Kingdom of Great Britain and Northern Ireland').bool():
        energy.loc[index, ['Country']] = 'United Kingdom'
        print(energy.loc[index, ['Country']])
    if energy.loc[index, ['Country']].str.contains('China, Hong Kong Special Administrative Region').bool():
        energy.loc[index, ['Country']] = 'Hong Kong'
        print(energy.loc[index, ['Country']])
Answer 0 (score: 0)
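You can use np.where:
import numpy as np
# Where the condition holds, take the new name; otherwise keep the existing value
energy['Country'] = np.where(energy['Country'] == 'United States of America', 'United States', energy['Country'])
energy['Country'] = np.where(energy['Country'] == 'Republic of Korea', 'Korea', energy['Country'])
Or, with boolean indexing:
# Use .loc with a row mask and a column label; chained indexing such as
# energy['Country'][mask] = ... may assign to a temporary copy instead
energy.loc[energy['Country'] == 'United States of America', 'Country'] = 'United States'
energy.loc[energy['Country'] == 'Republic of Korea', 'Country'] = 'Korea'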
Input df:
                    Country
0  United States of America
1                     Spain
2         Republic of Korea
3                    France
Output:
         Country
0  United States
1          Spain
2          Korea
3         France
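Since the question asks about .where specifically, here is a minimal sketch of the same renaming done with the pandas Series.mask method (the counterpart of Series.where) applied once per entry of a dictionary. The sample DataFrame and the `renames` dictionary are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data mirroring the example above
energy = pd.DataFrame({"Country": ["United States of America", "Spain",
                                   "Republic of Korea", "France"]})

# Hypothetical mapping of old names to new names
renames = {"United States of America": "United States",
           "Republic of Korea": "Korea"}

country = energy["Country"]
for old, new in renames.items():
    # mask() replaces values where the condition is True and keeps the rest
    country = country.mask(country == old, new)
energy["Country"] = country

print(energy["Country"].tolist())
```

Each pass is vectorized over the whole column, so the Python-level loop runs once per rename rather than once per row.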
Answer 1 (score: 0)
You can declare a dictionary that holds the mapping and then use map.
For example:
import pandas as pd
# Sample mapping from the original names to the requested names
mapVal = {'United States of America': 'United States',
          'Republic of Korea': 'South Korea',
          'United Kingdom of Great Britain and Northern Ireland': 'United Kingdom',
          'China, Hong Kong Special Administrative Region': 'Hong Kong'}
df = pd.DataFrame({'Country': ['United States of America',
                               'Republic of Korea',
                               'United Kingdom of Great Britain and Northern Ireland',
                               'China, Hong Kong Special Administrative Region']})
# Write the result to a new column; to overwrite instead, assign to df["Country"]
df["newVal"] = df["Country"].map(mapVal)
print(df)
Output:
                                             Country          newVal
0                           United States of America   United States
1                                  Republic of Korea     South Korea
2  United Kingdom of Great Britain and Northern I...  United Kingdom
3     China, Hong Kong Special Administrative Region       Hong Kong
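One caveat with map: any value that has no key in the dictionary comes back as NaN, so overwriting the Country column directly would blank out every country that is not being renamed. A minimal sketch of one way around that, assuming a small hypothetical sample in which "Spain" is not in the mapping, is to fill the gaps from the original column:

```python
import pandas as pd

# Hypothetical partial mapping; "Spain" is deliberately left out
map_val = {"United States of America": "United States",
           "Republic of Korea": "South Korea"}
df = pd.DataFrame({"Country": ["United States of America", "Spain",
                               "Republic of Korea"]})

# map() yields NaN for values that are not keys of the dictionary
mapped = df["Country"].map(map_val)

# Fall back to the original name wherever no mapping exists
df["Country"] = mapped.fillna(df["Country"])

print(df["Country"].tolist())
```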
Answer 2 (score: 0)
You can use the pandas replace() method:
energy
                                             Country
0                           United States of America
1                                  Republic of Korea
2  United Kingdom of Great Britain and Northern I...
3     China, Hong Kong Special Administrative Region
energy.replace(rep_map)
          Country
0   United States
1     South Korea
2  United Kingdom
3       Hong Kong
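Note that replace() replaces every cell in the DataFrame, in any column, whose value exactly equals one of these strings. It also returns a new DataFrame rather than modifying energy in place, so assign the result back (energy = energy.replace(rep_map)) to keep it.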
Data:
import pandas as pd
countries = ["United States of America",
             "Republic of Korea",
             "United Kingdom of Great Britain and Northern Ireland",
             "China, Hong Kong Special Administrative Region"]
replacements = ["United States", "South Korea", "United Kingdom", "Hong Kong"]
# Pair each original name with its replacement
rep_map = dict(zip(countries, replacements))
energy = pd.DataFrame({"Country": countries})
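Because DataFrame.replace() looks at every column, it may help to restrict the replacement to the Country column when other columns could contain the same strings. The sketch below shows two ways to do that; the second column, "Partner", is a hypothetical addition used only to demonstrate that it is left untouched.

```python
import pandas as pd

rep_map = {"United States of America": "United States",
           "Republic of Korea": "South Korea"}

energy = pd.DataFrame({
    "Country": ["United States of America", "Republic of Korea"],
    # Hypothetical second column that happens to hold the same strings
    "Partner": ["Republic of Korea", "United States of America"],
})

# Option 1: call replace() on the single column (a Series)
by_series = energy.copy()
by_series["Country"] = by_series["Country"].replace(rep_map)

# Option 2: a nested dict {column: {old: new}} limits replace() to that column
by_nested = energy.replace({"Country": rep_map})

print(by_series)
```

Both options produce the same result here; the Series form is probably the easier one to read when only one column is involved.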