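I'm sure there is a simple answer, but I have been searching and can't find anything on it.
I have a data frame (sdata) with a column named "landcover". It is a categorical variable, but for now each land cover type is represented by a number.
I want to replace the landcover numeric codes with text, and I have worked out how to do this partially: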
sdata$landcover<- as.factor(sdata$landcover)
levels(sdata$landcover) <- gsub("1", "w.subboreal", levels(sdata$landcover))
levels(sdata$landcover) <- gsub("2", "PICO", levels(sdata$landcover))
levels(sdata$landcover) <- gsub("3", "ABLA.PIEN", levels(sdata$landcover))
levels(sdata$landcover) <- gsub("5", "dry.forest", levels(sdata$landcover))
levels(sdata$landcover) <- gsub("10", "shrubby", levels(sdata$landcover))
levels(sdata$landcover) <- gsub("11", "agriculture", levels(sdata$landcover))
levels(sdata$landcover) <- gsub("13", "disturbed", levels(sdata$landcover))
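This works for the single-digit codes, but the code 13, for example, becomes "w.subborealABLA.PIEN" (the replacements for 1 and 3 combined), and the code 10 becomes "w.subboreal0" (the replacement for 1 followed by a 0). How can I make sure a two-digit code is treated as one number, rather than as two separate single digits to be replaced?
Thanks!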
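For readers wondering why this happens: gsub() replaces every substring that matches the pattern, so the pattern "1" also matches the first character of "10" and "13". The sketch below reproduces the effect on a small character vector (codes is a made-up stand-in for the factor levels) and shows that anchoring the pattern with ^ and $ restricts it to whole-string matches. It only illustrates the cause and is not necessarily the best fix.

```r
# Hypothetical stand-in for levels(sdata$landcover)
codes <- c("1", "3", "10", "13")

# Unanchored pattern: "1" also matches inside "10" and "13"
unanchored <- gsub("1", "w.subboreal", codes)
print(unanchored)
# [1] "w.subboreal"  "3"            "w.subboreal0" "w.subboreal3"

# Anchored pattern: ^ and $ force a match on the whole string only
anchored <- gsub("^1$", "w.subboreal", codes)
print(anchored)
# [1] "w.subboreal" "3"           "10"          "13"
```

A later gsub("3", ...) call then turns "w.subboreal3" into "w.subborealABLA.PIEN", which is the result described in the question.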
Answer 0 (score: 7)
Why not use the labels argument of factor()?
set.seed(1)
x <- sample(c(1, 2, 3, 5, 10, 11, 13), 20, TRUE)
x
# [1] 2 3 10 13 2 13 13 10 10 1 2 2 10 3 11 5 11 13 3 11
factor(x, levels = c(1, 2, 3, 5, 10, 11, 13),
       labels = c("w.subboreal", "PICO", "ABLA.PIEN", "dry.forest",
                  "shrubby", "agriculture", "disturbed"))
# [1] PICO ABLA.PIEN shrubby disturbed PICO disturbed disturbed
# [8] shrubby shrubby w.subboreal PICO PICO shrubby ABLA.PIEN
# [15] agriculture dry.forest agriculture disturbed ABLA.PIEN agriculture
# Levels: w.subboreal PICO ABLA.PIEN dry.forest shrubby agriculture disturbed
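Applied to the data frame from the question, the same idea might look like the sketch below. The real sdata is not shown in the thread, so the one built here is a toy stand-in, and the names codes and labs are chosen only for illustration.

```r
# Toy stand-in for the asker's data frame; the real sdata is not shown
sdata <- data.frame(landcover = c(1, 13, 10, 3, 11, 2, 5))

# Numeric codes and their text labels, paired by position
codes <- c(1, 2, 3, 5, 10, 11, 13)
labs  <- c("w.subboreal", "PICO", "ABLA.PIEN", "dry.forest",
           "shrubby", "agriculture", "disturbed")

# Each code is matched as a whole value, so 13 never splits into 1 and 3
sdata$landcover <- factor(sdata$landcover, levels = codes, labels = labs)

print(as.character(sdata$landcover))
# [1] "w.subboreal" "disturbed"   "shrubby"     "ABLA.PIEN"   "agriculture"
# [6] "PICO"        "dry.forest"
```

Any value in the column that is not listed in levels becomes NA, which can help catch unexpected codes.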
Answer 1 (score: 0)
# Create a factor from a numeric vector
v <- rep(1:3, 4)
v <- factor(v)  # a factor of length 12 with levels "1", "2", "3"
# Replace the levels 1, 2, 3 with the names you want; suppose 1 stands for "China", 2 for "USA", and 3 for "Spain"
levels(v) <- c("China", "USA", "Spain")
# If you just need a character vector of country names rather than numbers or a factor
v <- as.character(v)
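One caveat with assigning a plain character vector to levels(): the new names are matched to the existing levels by position, so it only works if you know the current level order. A possible alternative is to assign a named list, which pairs each new name with the old level explicitly. The sketch below uses a few of the land cover codes from the question; the vector v is made up for illustration.

```r
# Made-up vector using some of the codes from the question
v <- factor(c(10, 1, 13, 1, 10))
print(levels(v))
# [1] "1"  "10" "13"

# Named list: each new name is paired with the old level it replaces
levels(v) <- list(w.subboreal = "1", shrubby = "10", disturbed = "13")

print(as.character(v))
# [1] "shrubby"     "w.subboreal" "disturbed"   "w.subboreal" "shrubby"
```

The list form does not depend on the order in which R happened to sort the original levels, so a two-digit code cannot end up with the wrong name by position.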