I have a data frame named prueba with a factor variable ccaa that contains the following levels:
x <- c("","Andalucia","Aragon","Asturias","Balears","Canarias","Cantabria",
"Castilla Leon","Castilla Mancha","Catalu<f1>a","Ceuta","Comunitat Valenciana",
"Extremadura","Galicia","Madrid","Melilla","Murcia","Navarra","Pa<ed>s Vasco",
"Rioja")
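# stringsAsFactors = TRUE is needed in R >= 4.0 for cca to be stored as a factor
prueba <- data.frame(cca = x, stringsAsFactors = TRUE)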
levels(prueba$cca)
# [1] "" "Andalucia" "Aragon" "Asturias"
# [5] "Balears" "Canarias" "Cantabria" "Castilla Leon"
# [9] "Castilla Mancha" "Catalu<f1>a" "Ceuta" "Comunitat Valenciana"
# [13] "Extremadura" "Galicia" "Madrid" "Melilla"
# [17] "Murcia" "Navarra" "Pa<ed>s Vasco" "Rioja"
I am trying to rename the levels that contain symbols, such as "Catalu<f1>a" and "Pa<ed>s Vasco". I have tried several options without success:
Option 1: using the revalue function from plyr
library(plyr)
prueba$ccaa <- revalue(prueba$ccaa, c("Pa<ed>s Vasco" = "Basque Country", "Catalu<f1>a" = "Catalonia"))
It produces the following error:
The following `from` values were not present in `x`: Pa<ed>s Vasco, Catalu<f1>a
Option 2:
levels(prueba$cca)[levels(prueba$cca) == "Catalu<f1>a"] <- "Catalonia"
levels(prueba$cca)[levels(prueba$cca) == "Pa<ed>s Vasco"] <- "Basque Country"
This runs without errors, but the levels are not renamed with the new labels:
levels(prueba$ccaa)
# [1] "" "Andalucia" "Aragon"
# [4] "Asturias" "Balears" "Canarias"
# [7] "Cantabria" "Castilla Leon" "Castilla Mancha"
# [10] "Catalu<f1>a" "Ceuta" "Comunitat Valenciana"
# [13] "Extremadura" "Galicia" "Madrid"
# [16] "Melilla" "Murcia" "Navarra"
# [19] "Pa<ed>s Vasco" "Rioja"
I don't understand why the levels are not getting the correct labels. Any suggestions about what might be happening?
Answer 0 (score: 1)
We can use recode from car:
library(car)
prueba$cca <- recode(prueba$cca, "'Pa<ed>s Vasco'='Basque Country';'Catalu<f1>a' = 'Catalonia'")
levels(prueba$cca)
#[1] "" "Andalucia" "Aragon" "Asturias" "Balears" "Basque Country" "Canarias"
#[8] "Cantabria" "Castilla Leon" "Castilla Mancha" "Catalonia" "Ceuta" "Comunitat Valenciana" "Extremadura"
#[15] "Galicia" "Madrid" "Melilla" "Murcia" "Navarra" "Rioja"
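For comparison, here is a minimal base R sketch of the same renaming that does not need an extra package. It is only an illustration: the shortened vector x and the lookup vector new_names are made up for this example, and the level names are taken literally as they are printed in the question (if the real data holds differently encoded bytes, the lookup keys would have to match those instead). stringsAsFactors = TRUE is passed explicitly because R 4.0 and later no longer convert strings to factors by default. The stopifnot() check is there because an assignment such as levels(f)[levels(f) == "name"] <- "new" silently does nothing when no level matches.

```r
# Shortened, hypothetical version of the example data from the question
x <- c("", "Andalucia", "Catalu<f1>a", "Pa<ed>s Vasco", "Rioja")
prueba <- data.frame(cca = x, stringsAsFactors = TRUE)

# Lookup vector: names are the old levels, values are the new levels
new_names <- c("Pa<ed>s Vasco" = "Basque Country",
               "Catalu<f1>a"   = "Catalonia")

# Position of each old level among the current levels (NA if not found)
idx <- match(names(new_names), levels(prueba$cca))

# Fail loudly instead of silently renaming nothing
stopifnot(!anyNA(idx))

# Overwrite the matched levels in place; all other levels are untouched
levels(prueba$cca)[idx] <- unname(new_names)

levels(prueba$cca)
```

Unlike recode, this keeps each renamed level in its original position instead of re-sorting the levels alphabetically.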