I have a DataFrame like this:
problem.head(30)
Out[25]:
Country
0 Sweden
1 Africa
2 Africa
3 Africa
4 Africa
5 Germany
6 Germany
7 Germany
8 Germany
9 UK
10 Germany
11 Germany
12 Germany
13 Germany
14 Sweden
15 Sweden
16 Africa
17 Africa
18 Africa
19 Africa
20 Africa
21 Africa
22 Africa
23 Africa
24 Africa
25 Africa
26 Pakistan
27 Pakistan
28 ZA
29 ZA
Now I want to replace each country name with the name of its continent.
What I did was create a list for each continent (my DataFrame contains 56 countries):
asia = ['Afghanistan', 'Bahrain', 'United Arab Emirates','Saudi Arabia', 'Kuwait', 'Qatar', 'Oman',
'Sultanate of Oman','Lebanon', 'Iraq', 'Yemen', 'Pakistan', 'Lebanon', 'Philippines', 'Jordan']
europe = ['Germany','Spain', 'France', 'Italy', 'Netherlands', 'Norway', 'Sweden','Czech Republic', 'Finland',
'Denmark', 'Czech Republic', 'Switzerland', 'UK', 'UK&I', 'Poland', 'Greece','Austria',
'Bulgaria', 'Hungary', 'Luxembourg', 'Romania' , 'Slovakia', 'Estonia', 'Slovenia','Portugal',
'Croatia', 'Lithuania', 'Latvia','Serbia', 'Estonia', 'ME', 'Iceland' ]
africa = ['Morocco', 'Tunisia', 'Africa', 'ZA', 'Kenya']
other = ['USA', 'Australia', 'Reunion', 'Faroe Islands']
Then I tried using replace:
dataframe['Continent'] = dataframe['Country'].replace(asia, 'Asia', regex=True)
where asia is my list and 'Asia' is the replacement text. But this did not work. It only works with
dataframe['Continent'] = dataframe['Country'].replace(np.nan, 'Asia', regex=True)
So, please help.
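Answer 0 (score: 1)
It is better to store your country-to-continent mapping as a single dictionary rather than four separate lists. Starting from your current lists, you can build it as follows: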
continents = {country: 'Asia' for country in asia}
continents.update({country: 'Europe' for country in europe})
continents.update({country: 'Africa' for country in africa})
continents.update({country: 'Other' for country in other})
You can then use the pandas map method to look up the continent for each country:
dataframe['Continent'] = dataframe['Country'].map(continents)
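One thing to watch for with map: a country that is missing from the dictionary comes back as NaN rather than raising an error. The following is a minimal sketch of one way to handle that, using shortened versions of the lists from the question and a made-up 'Brazil' row that is deliberately absent from every list. The choice of 'Other' as the fallback label is an assumption, not something the question specifies.

```python
import pandas as pd

# Shortened lists, for illustration only
asia = ['Pakistan', 'Jordan']
europe = ['Germany', 'Sweden', 'UK']
africa = ['Africa', 'ZA']

# Build one country -> continent dictionary from the lists
continents = {country: 'Asia' for country in asia}
continents.update({country: 'Europe' for country in europe})
continents.update({country: 'Africa' for country in africa})

# 'Brazil' is a hypothetical entry that appears in none of the lists
dataframe = pd.DataFrame(
    {'Country': ['Sweden', 'Africa', 'UK', 'Pakistan', 'ZA', 'Brazil']}
)

# map() yields NaN for keys not in the dictionary; fillna() supplies a fallback
dataframe['Continent'] = dataframe['Country'].map(continents).fillna('Other')
print(dataframe)
```

Leaving out the fillna call keeps the NaN values, which can be useful for spotting countries that still need to be added to the mapping.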
Answer 1 (score: 1)
Use apply with a custom function.
Demo:
import pandas as pd
asia = ['Afghanistan', 'Bahrain', 'United Arab Emirates','Saudi Arabia', 'Kuwait', 'Qatar', 'Oman',
'Sultanate of Oman','Lebanon', 'Iraq', 'Yemen', 'Pakistan', 'Lebanon', 'Philippines', 'Jordan']
europe = ['Germany','Spain', 'France', 'Italy', 'Netherlands', 'Norway', 'Sweden','Czech Republic', 'Finland',
'Denmark', 'Czech Republic', 'Switzerland', 'UK', 'UK&I', 'Poland', 'Greece','Austria',
'Bulgaria', 'Hungary', 'Luxembourg', 'Romania' , 'Slovakia', 'Estonia', 'Slovenia','Portugal',
'Croatia', 'Lithuania', 'Latvia','Serbia', 'Estonia', 'ME', 'Iceland' ]
africa = ['Morocco', 'Tunisia', 'Africa', 'ZA', 'Kenya']
other = ['USA', 'Australia', 'Reunion', 'Faroe Islands']
def GetConti(country):
    # Return the continent label for a country name; anything unlisted is "other"
    if country in asia:
        return "Asia"
    elif country in europe:
        return "Europe"
    elif country in africa:
        return "Africa"
    else:
        return "other"
df = pd.DataFrame({"Country": ["Sweden", "Africa", "Africa", "Germany", "Germany", "UK","Pakistan"]})
df['Continent'] = df['Country'].apply(GetConti)
print(df)
Output:
Country Continent
0 Sweden Europe
1 Africa Africa
2 Africa Africa
3 Germany Europe
4 Germany Europe
5 UK Europe
6 Pakistan Asia
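The same if/elif logic can also be written without a Python-level function call per row. The sketch below is an alternative, not part of the answer above: it combines Series.isin with numpy.select, again using shortened lists and a made-up 'USA' row to exercise the fallback. The conditions are checked in order, mirroring the if/elif chain.

```python
import numpy as np
import pandas as pd

# Shortened lists, for illustration only
asia = ['Pakistan', 'Jordan']
europe = ['Germany', 'Sweden', 'UK']
africa = ['Africa', 'ZA']

df = pd.DataFrame(
    {"Country": ["Sweden", "Africa", "Germany", "UK", "Pakistan", "USA"]}
)

# One boolean mask per continent, in the same order as the labels
conditions = [
    df['Country'].isin(asia).to_numpy(),
    df['Country'].isin(europe).to_numpy(),
    df['Country'].isin(africa).to_numpy(),
]
labels = ['Asia', 'Europe', 'Africa']

# The first matching condition wins; rows matching none get the default
df['Continent'] = np.select(conditions, labels, default='other')
print(df)
```

For 56 countries either approach is fast enough, so the choice is mostly a matter of readability.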