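I have a data frame in which one column holds country names. I want to add a column giving the continent for each country. Consider the following example: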
my.df <- data.frame(country = c("Afghanistan","Algeria"))
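Is there a package I can use to append a column of continent names without having to supply the underlying country-to-continent data myself?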
Answer 0 (score: 13)
You can use the countrycode package for this task.
library(countrycode)
df <- data.frame(country = c("Afghanistan",
                             "Algeria",
                             "USA",
                             "France",
                             "New Zealand",
                             "Fantasyland"))
df$continent <- countrycode(sourcevar = df[, "country"],
                            origin = "country.name",
                            destination = "continent")
#warning
#In countrycode(sourcevar = df[, "country"], origin = "country.name", :
# Some values were not matched unambiguously: Fantasyland
Result
df
#      country continent
#1 Afghanistan      Asia
#2     Algeria    Africa
#3         USA  Americas
#4      France    Europe
#5 New Zealand   Oceania
#6 Fantasyland      <NA>
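If you would rather give an unmatched name an explicit label than leave it as NA, countrycode accepts a custom_match argument (a named vector mapping origin values to destination values). Below is a minimal sketch; the label "Unknown" is an arbitrary placeholder chosen for illustration, not something the package defines.

```r
library(countrycode)

df <- data.frame(country = c("Afghanistan", "Algeria", "Fantasyland"),
                 stringsAsFactors = FALSE)

# custom_match overrides the built-in lookup for the listed origin values.
# "Unknown" is a placeholder label picked for this example.
df$continent <- countrycode(sourcevar = df$country,
                            origin = "country.name",
                            destination = "continent",
                            custom_match = c("Fantasyland" = "Unknown"))

# Any rows that are still NA are the ones that need manual attention.
unmatched <- df$country[is.na(df$continent)]
```

Checking `unmatched` after the call is a quick way to see whether any names still need a manual mapping.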
Answer 1 (score: 2)
You could try
my.df <- data.frame(country = c("Afghanistan", "Algeria"),
                    continent = as.factor(c("Asia", "Africa")))
# Left join on the country name; all.x = TRUE keeps rows with no match
merge(my.df, raster::ccodes()[, c("NAME", "CONTINENT")],
      by.x = "country", by.y = "NAME", all.x = TRUE)
#       country continent CONTINENT
# 1 Afghanistan      Asia      Asia
# 2     Algeria    Africa    Africa
Some country values might need adjusting; I can't tell, since you didn't provide all of your values.
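To find out which country values need adjusting before the merge, you can compare your names against the reference table. Here is a rough sketch in base R. The lookup data frame is a small hand-made stand-in for the real reference table, and the fixes vector and country_std column are names invented for this example.

```r
my.df <- data.frame(country = c("Afghanistan", "Algeria", "USA"),
                    stringsAsFactors = FALSE)

# Hypothetical stand-in for a real reference table (values invented for illustration)
lookup <- data.frame(NAME = c("Afghanistan", "Algeria", "United States"),
                     CONTINENT = c("Asia", "Africa", "North America"),
                     stringsAsFactors = FALSE)

# Names in my.df that have no exact counterpart in the reference table
unmatched <- setdiff(my.df$country, lookup$NAME)

# Manual recoding of the problem names (mapping chosen by hand)
fixes <- c("USA" = "United States")
idx <- my.df$country %in% names(fixes)
my.df$country_std <- my.df$country
my.df$country_std[idx] <- fixes[my.df$country[idx]]

# Left join on the standardised name
merged <- merge(my.df, lookup, by.x = "country_std", by.y = "NAME", all.x = TRUE)
```

Keeping the original country column and joining on a separate standardised column means the recoding can be reviewed afterwards.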
Answer 2 (score: 0)
Building on Markus's answer: countrycode takes its definition of "continent" from codelist.
?codelist
Definition of continent:
continent: Continent as defined in the World Bank Development Indicators
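The question asks for continents, but sometimes continents don't provide enough groups to separate the data meaningfully. For example, continent lumps North and South America together as Americas.
What you might want instead is region:
region: Regions as defined in the World Bank Development Indicators
It isn't clear how the World Bank groups its regions, but the following code shows the granularity of that destination.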
library(countrycode)
egnations <- c("Afghanistan","Algeria","USA","France","New Zealand","Fantasyland")
countrycode(sourcevar = egnations, origin = "country.name",destination = "region")
Output:
[1] "Southern Asia"
[2] "Northern Africa"
[3] "Northern America"
[4] "Western Europe"
[5] "Australia and New Zealand"
[6] NA
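To see the difference in granularity directly, you can put both destinations side by side. This is only a sketch: the comparison data frame is a name made up for this example, and the region labels you get may differ from the output above depending on the installed countrycode version.

```r
library(countrycode)

egnations <- c("Afghanistan", "Algeria", "USA", "France", "New Zealand")

# One row per country, with both grouping schemes as columns
comparison <- data.frame(
  country   = egnations,
  continent = countrycode(egnations, origin = "country.name",
                          destination = "continent"),
  region    = countrycode(egnations, origin = "country.name",
                          destination = "region"),
  stringsAsFactors = FALSE
)

# Number of distinct groups each scheme produces for these countries
sapply(comparison[c("continent", "region")], function(x) length(unique(x)))
```

On a real data set with many countries, the count of distinct regions will typically be higher than the count of continents, which is the point of using region.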