Replacing numbers with text in a data frame

Date: 2014-02-04 16:18:32

Tags: r replace gsub

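I'm sure there is a simple answer to this, but I have been searching and can't find anything relevant.

I have a data frame (sdata) with a column named "landcover". It is a categorical variable, but for now each land cover type is represented by a number.

I want to replace the landcover numeric codes with text, and I have worked out how to do this partially: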

sdata$landcover<- as.factor(sdata$landcover)
levels(sdata$landcover) <- gsub("1", "w.subboreal", levels(sdata$landcover))
levels(sdata$landcover) <- gsub("2", "PICO", levels(sdata$landcover))
levels(sdata$landcover) <- gsub("3", "ABLA.PIEN", levels(sdata$landcover))
levels(sdata$landcover) <- gsub("5", "dry.forest", levels(sdata$landcover))
levels(sdata$landcover) <- gsub("10", "shrubby", levels(sdata$landcover))
levels(sdata$landcover) <- gsub("11", "agriculture", levels(sdata$landcover))
levels(sdata$landcover) <- gsub("13", "disturbed", levels(sdata$landcover))

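This works for the single-digit codes, but the number 13, for example, becomes "w.subborealABLA.PIEN" (a combination of 1 and 3), and the number 10 becomes "w.subboreal0" (a combination of 1 and 0). How can I make sure a two-digit number is treated as one number, rather than as two separate single digits to be replaced? Thanks!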
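The behavior described above can be reproduced on a small vector of level strings. gsub() replaces every substring that matches the pattern, so an unanchored "1" also matches inside "10" and "13". The sketch below uses a hypothetical vector lv in place of levels(sdata$landcover), and contrasts the unanchored calls with patterns anchored by ^ and $, which only match a whole string. It is an illustration under those assumptions, not necessarily the best fix.

```r
# Hypothetical level strings standing in for levels(sdata$landcover)
lv <- c("1", "3", "10", "13")

# Unanchored patterns match any substring, so "1" also hits "10" and "13"
bad <- gsub("1", "w.subboreal", lv)
bad <- gsub("3", "ABLA.PIEN", bad)
bad
# [1] "w.subboreal"  "ABLA.PIEN"  "w.subboreal0"  "w.subborealABLA.PIEN"

# Anchored patterns (^ = start, $ = end) only match the whole string
good <- gsub("^1$", "w.subboreal", lv)
good <- gsub("^3$", "ABLA.PIEN", good)
good <- gsub("^10$", "shrubby", good)
good <- gsub("^13$", "disturbed", good)
good
# [1] "w.subboreal" "ABLA.PIEN"   "shrubby"     "disturbed"
```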

2 answers:

Answer 0 (score: 7)

Why not use the labels argument of factor()?

set.seed(1)
x <- sample(c(1, 2, 3, 5, 10, 11, 13), 20, TRUE)
x
#  [1]  2  3 10 13  2 13 13 10 10  1  2  2 10  3 11  5 11 13  3 11
factor(x, levels = c(1, 2, 3, 5, 10, 11, 13), 
       labels = c("w.subboreal", "PICO", "ABLA.PIEN", "dry.forest", 
                  "shrubby", "agriculture", "disturbed"))
#  [1] PICO        ABLA.PIEN   shrubby     disturbed   PICO        disturbed   disturbed  
#  [8] shrubby     shrubby     w.subboreal PICO        PICO        shrubby     ABLA.PIEN  
# [15] agriculture dry.forest  agriculture disturbed   ABLA.PIEN   agriculture
# Levels: w.subboreal PICO ABLA.PIEN dry.forest shrubby agriculture disturbed
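Applied to the data frame from the question, the same idea might look like the sketch below. The sdata built here is a made-up stand-in with a handful of rows, and the names codes and labs are introduced only for illustration. The code-to-label pairs are the ones given in the question.

```r
# Hypothetical stand-in for the asker's data frame
sdata <- data.frame(landcover = c(1, 13, 10, 2, 5, 11, 3, 13))

# Codes and their labels, paired by position (taken from the question)
codes <- c(1, 2, 3, 5, 10, 11, 13)
labs  <- c("w.subboreal", "PICO", "ABLA.PIEN", "dry.forest",
           "shrubby", "agriculture", "disturbed")

# Convert the numeric column to a labelled factor in one step
sdata$landcover <- factor(sdata$landcover, levels = codes, labels = labs)

as.character(sdata$landcover)
# [1] "w.subboreal" "disturbed"   "shrubby"     "PICO"        "dry.forest"
# [6] "agriculture" "ABLA.PIEN"   "disturbed"
```

Because each code is matched as a whole value rather than as a substring, 10 and 13 cannot be confused with 1 and 3. Any value in the column that is not listed in levels would become NA.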

Answer 1 (score: 0)

# Create a factor from a numeric vector
v <- rep(1:3, 4)
v <- factor(v)  # a factor of length 12 with levels "1", "2", "3"
# Rename the levels in order; suppose 1 stands for "China", 2 for "USA" and 3 for "Spain"
levels(v) <- c("China", "USA", "Spain")
# If you need a character vector of country names rather than a factor
v <- as.character(v)
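Assigning to the levels by position relies on knowing the order in which factor() sorted them. For a numeric vector the levels are sorted numerically, but a column stored as character would sort "10" before "2". One way to avoid depending on that order is a named lookup vector, sketched below. The names lookup and countries are hypothetical and used only for illustration.

```r
# Named lookup vector: names are the codes (as strings), values are the labels
lookup <- c("1" = "China", "2" = "USA", "3" = "Spain")

v <- rep(1:3, 4)

# Index the lookup by the code converted to character; unname() drops the names
countries <- unname(lookup[as.character(v)])
countries
#  [1] "China" "USA"   "Spain" "China" "USA"   "Spain" "China" "USA"   "Spain"
# [10] "China" "USA"   "Spain"
```

Codes that are missing from the lookup come back as NA, which makes unexpected values easy to spot.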