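I have a data.frame with a variable V21 that records many countries, and I would like to narrow it down by specifying only continents instead of all these countries. For example, instead of 'Cuba', 'Peru', and 'Argentina' being separate levels of V21, I want them to become a single level, 'South America'. This is the code I tried: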
recode(WaveOne.test$V21, "levels("Cuba","Colombia","Costa Rica","Argentina","Chile","Ecuador","Peru","Venezuela")= 'South America'")
Can you tell me what is wrong with my code, or suggest another approach? I am a complete beginner with R and its syntax. Thanks!
====== UPDATE ======
SA_countries <- c("Cuba", "Mexico", "Argentina","Jamaica", "Haiti","West Indies", "Chile", "Ecuador", "Venezuela", "Other South America", "El Salvador", "Guatemala", "Nicaragua", "Dominican Republic", "Panama", "Costa Rica", "Peru")
Asia_countries <- c("Philippines", "Vietnam", "Laos", "Cambodia", "Hmong", "Other Asia", "China", "Hong Kong", "Taiwan", "Japan", "Korea", "India", "Pakistan")
Europe_Canada <- c("Europe/Canada")
MiddleEast_Africa <- c("Middle East/Africa")
continents <- list(`South America`= SA_countries, `Asia` = Asia_countries, `Europe_Canada` = Europe_Canada, `Middle East & Africa` = MiddleEast_Africa)
levels(WaveOne.test$V21) <- c(levels(WaveOne.test$V21), names(continents))
for(i in seq_along(continents)) WaveOne.test$V21[WaveOne.test$V21 %in% continents[[i]]] <- names(continents)[i]
levels(WaveOne.test$V21)
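My output is:
levels(WaveOne.test$V21)
 [1] "Cuba" "Mexico" "Nicaragua" "Colombia" "Dominican Republic" "El Salvador" "Guatemala"
 [8] "Honduras" "Costa Rica" "Panama" "Argentina" "Chile" "Ecuador" "Peru"
[15] "Venezuela" "Other South America" "Haiti" "Jamaica" "West Indies" "Philippines" "Vietnam"
[22] "Laos" "Cambodia" "Hmong" "Other Asia" "China" "Hong Kong" "Taiwan"
[29] "Japan" "Korea" "India" "Pakistan" "Middle East/Africa" "Europe/Canada" "South America"
[36] "Asia" "Europe_Canada" "Middle East & Africa"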
Answer 0 (score: 1)
You can create a list that maps each continent to its countries, then reassign the values accordingly:
continents <- list(`South America`=SA_countries,
`North America` = NA_countries,
Europe=Euro_countries)
levels(df$V21) <- c(levels(df$V21), names(continents)) #necessary to add new levels
for(i in seq_along(continents)) {
df$V21[df$V21 %in% continents[[i]]] <- names(continents)[i]}
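Note that the loop only reassigns values. The original country levels stay attached to the factor even when no row uses them any more, which is why `levels()` still lists them afterwards. A minimal sketch of removing them with base R's `droplevels()`, using a small made-up factor `f` and list `groups` (both hypothetical, not the asker's data):

```r
# Hypothetical toy factor and grouping list, for illustration only
f <- factor(c("Cuba", "Peru", "Japan"))
groups <- list(`South America` = c("Cuba", "Peru"), Asia = "Japan")

levels(f) <- c(levels(f), names(groups))  # add the new levels first
for (i in seq_along(groups)) f[f %in% groups[[i]]] <- names(groups)[i]

levels(f)           # the old country levels are still listed here
f <- droplevels(f)  # keep only the levels that actually occur in f
levels(f)           # now only the continent levels remain
```

For a data frame column this would be `df$V21 <- droplevels(df$V21)`.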
Reproducible example
set.seed(123)
SA_countries <- c("Cuba","Colombia","Costa Rica","Argentina","Chile","Ecuador","Peru","Venezuela")
NA_countries <- c("Mexico", "USA", "Canada")
Euro_countries <- c("Germany", "France")
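# stringsAsFactors = TRUE makes V21 a factor (no longer the default since R 4.0)
df <- data.frame(V21 = sample(c(NA_countries, SA_countries, Euro_countries), 20, TRUE), stringsAsFactors = TRUE)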
df
# V21
# 1 Cuba
# 2 Venezuela
# 3 Costa Rica
# 4 Germany
# 5 France
# 6 Mexico
# 7 Argentina
# 8 Germany
# 9 Chile
# 10 Costa Rica
# 11 France
# 12 Costa Rica
# 13 Ecuador
# 14 Chile
# 15 USA
# 16 Germany
# 17 Cuba
# 18 Mexico
# 19 Colombia
# 20 France
continents <- list(`South America`=SA_countries, `North America` = NA_countries, Europe=Euro_countries)
levels(df$V21) <- c(levels(df$V21), names(continents))
for(i in seq_along(continents)) df$V21[df$V21 %in% continents[[i]]] <- names(continents)[i]
df
# V21
# 1 South America
# 2 South America
# 3 South America
# 4 Europe
# 5 Europe
# 6 North America
# 7 South America
# 8 Europe
# 9 South America
# 10 South America
# 11 Europe
# 12 South America
# 13 South America
# 14 South America
# 15 North America
# 16 Europe
# 17 South America
# 18 North America
# 19 South America
# 20 Europe
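As a possible alternative to the loop, base R also accepts a named list on the right-hand side of `levels<-` for a factor: each list name becomes a new level, and the old levels listed under it are merged into it. The sketch below uses a made-up factor `f` and a hypothetical list `continent_map`. One caveat to keep in mind: any old level that does not appear in the list becomes `NA`, so the list should cover every country.

```r
# Hypothetical toy data, for illustration only
f <- factor(c("Cuba", "Peru", "Japan", "Chile", "India"))

# Each name is a new level; its element lists the old levels to merge into it
continent_map <- list(
  `South America` = c("Cuba", "Peru", "Chile"),
  Asia            = c("Japan", "India")
)

g <- f                       # work on a copy so f stays unchanged
levels(g) <- continent_map   # merge the country levels into continents
g
levels(g)                    # only the continent names remain as levels
```

Because the new level set is taken from the list names, no leftover country levels remain and no separate `droplevels()` call is needed with this approach.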