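In a DataFrame, I have a column that holds the names of different countries, and I want to create a new column containing each country's region: for example, if the country is India, the region should be Asia, and so on. I tried using np.where, but it seems I am doing something wrong. Here is the code I tried: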
Region = np.where(country_name == 'US' , "US",
np.where(country_name == ('Brazil' or 'Canada' or 'Peru' or 'Chile') , "Rest of America",
np.where(country_name == ('South Africa 'or 'Egypt' or 'Morocco' or 'Algeria' or 'Ghana'), "Africa",
np.where(country_name == ('Afghanistan'or 'Armenia'or 'Azerbaijan' or 'Bahrain'or'Bangladesh'or 'Bhutan'or
'Brunei'or 'Burma'or 'Cambodia'or 'China'or 'East Timor' or
'Georgia'or 'Hong Kong'or 'India' or 'Indonesia'or 'Iran' or 'Iraq'or 'Israel'or 'Japan'or
'Jordan'or 'Kazakhstan'or 'Kuwait'or 'Kyrgyzstan'or 'Laos'or
'Lebanon'or 'Malaysia' or 'Mongolia'or 'Nepal'or 'North Korea'or 'Oman'or 'Pakistan'|
'Papua New Guinea'or 'Philippines'or 'Qatar'or 'Russia'or 'Saudi Arabia'or 'Singapore'|
'South Korea'or 'Sri Lanka'or 'Syria'or 'Taiwan'or 'Tajikistan'or 'Thailand'or 'Turkey'or 'Turkmenistan'or
'United Arab Emirates'or 'Uzbekistan'or 'Vietnam'or 'Yemen'), "Asia",
np.where(country_name == ('Spain'or 'Italy' or 'Germany'or 'United Kingdom' or'France'), "Europe", "Unchange")))))
Below is the data:
    Entity       Region  Code  Date        Total confirmed deaths (deaths)  Total confirmed cases (cases)
0   Afghanistan  Asia    AFG   2019-12-31  0                                0
1   Afghanistan  Asia    AFG   2020-01-01  0                                0
2   Afghanistan  Asia    AFG   2020-01-02  0                                0
3   Afghanistan  Asia    AFG   2020-01-03  0                                0
4   Afghanistan  Asia    AFG   2020-01-04  0                                0
5   Afghanistan  Asia    AFG   2020-01-05  0                                0
6   Afghanistan  Asia    AFG   2020-01-06  0                                0
7   Afghanistan  Asia    AFG   2020-01-07  0                                0
8   Afghanistan  Asia    AFG   2020-01-08  0                                0
9   Afghanistan  Asia    AFG   2020-01-09  0                                0
10  Afghanistan  Asia    AFG   2020-01-10  0                                0
11  Afghanistan  Asia    AFG   2020-01-11  0                                0
However, this code only works for the first country in each group, i.e. only for Brazil, South Africa, Afghanistan and Spain.
Answer 0 (score: 0)
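The attempt in the question most likely fails because Python evaluates `('Brazil' or 'Canada' or ...)` before the comparison runs: `or` returns its first truthy operand, so the whole expression collapses to `'Brazil'`. A minimal sketch of that behaviour, using a hypothetical three-row `country_name` Series that is not part of the original data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample column, for illustration only.
country_name = pd.Series(["Brazil", "Canada", "India"])

# `or` returns its first truthy operand, so this is just the string 'Brazil'.
collapsed = ('Brazil' or 'Canada' or 'Peru')

# Comparing against the collapsed value therefore only matches 'Brazil'.
broken = np.where(country_name == collapsed, "Rest of America", "Unchanged")

# isin() tests membership against every name in the list.
fixed = np.where(country_name.isin(['Brazil', 'Canada', 'Peru']),
                 "Rest of America", "Unchanged")

print(collapsed)
print(broken)
print(fixed)
```

This is why only the first country of each group was labelled, and why the lists below are tested with `isin` instead.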
import numpy as np

# Europe
list_1 = ["Iceland", "Norway", "Sweden", "Finland", "Denmark", "United Kingdom", "Ireland",
"France", "Belgium", "Netherlands", "Luxembourg", "Monaco", "Portugal", "Spain",
"Andorra", "Italy", "Malta", "San Marino", "Vatican City", "Germany",
"Switzerland", "Liechtenstein", "Austria", "Poland", "Czech Republic", "Slovakia",
"Hungary", "Slovenia", "Croatia", "Bosnia and Herzegovina", "Serbia", "Montenegro",
"Albania", "Macedonia", "Romania", "Bulgaria", "Greece", "Estonia", "Latvia",
"Lithuania", "Belarus", "Ukraine", "Moldova"]
list_2 = ['Brazil' , 'Canada' , 'Peru' , 'Chile', 'South America']
list_3 = ['Afghanistan', 'Armenia', 'Azerbaijan', 'Bahrain' ,'Bangladesh', 'Bhutan',
'Brunei', 'Burma', 'Cambodia', 'China', 'East Timor','Georgia', 'Hong Kong',
'India' , 'Indonesia', 'Iran' , 'Iraq' ,'Israel' , 'Japan','Jordan', 'Kazakhstan',
'Kuwait' , 'Kyrgyzstan' , 'Laos', 'Lebanon', 'Malaysia' , 'Mongolia', 'Nepal',
'North Korea', 'Oman', 'Pakistan','Papua New Guinea', 'Philippines', 'Qatar',
'Saudi Arabia','Singapore', 'South Korea', 'Sri Lanka', 'Syria', 'Taiwan',
'Tajikistan', 'Thailand', 'Turkey', 'Turkmenistan','United Arab Emirates',
'Uzbekistan', 'Vietnam', 'Yemen']
list_4 = ['United States']
list_5 = ['South Africa', 'Egypt', 'Morocco', 'Algeria', 'Ghana', 'Africa']
conditions = [
(df['Entity'].isin(list_4)),
(df['Entity'].isin(list_2)),
(df['Entity'].isin(list_5)),
(df['Entity'].isin(list_3)),
(df['Entity'].isin(list_1))
]
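# Region labels, in the same order as the conditions above
choices = ['US', "Rest of America", "Africa", "Asia", "Europe"]
# The first matching condition wins; rows matching none get the default label
df['Region'] = np.select(conditions, choices, default='Rest of the world')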
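As an alternative to `np.select`, the same lookup can be expressed as a dictionary passed to `Series.map`. The sketch below is only an illustration: the `region_of` dictionary and the four-row `df` are hypothetical, and the dictionary is deliberately shortened rather than covering every country in the lists above.

```python
import pandas as pd

# Hypothetical, shortened lookup; extend it with the full lists above.
region_of = {
    "United States": "US",
    "Brazil": "Rest of America",
    "Canada": "Rest of America",
    "South Africa": "Africa",
    "India": "Asia",
    "Spain": "Europe",
}

# Hypothetical sample frame standing in for the real data.
df = pd.DataFrame({"Entity": ["India", "Brazil", "Spain", "Australia"]})

# map() looks up each country; names missing from the dict become NaN,
# which fillna() then replaces with the default label.
df["Region"] = df["Entity"].map(region_of).fillna("Rest of the world")

print(df)
```

A single dictionary keeps each country next to its region, which can make it harder to forget a comma or list a name twice than with several parallel lists.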