Changing the levels of a character column in R

Time: 2014-12-23 03:29:44

Tags: r

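I have the following data and code. I am trying to change the order of the levels (from a, b, c to c, a, b) by converting a character column to a factor and then changing its levels. However, this also changes the values: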

> mydf$new = c('a','b','a','c','b')
> mydf
   vnum1 vnum2 vch1 new
1:   0.6   0.7    B   a
2:  -1.4   0.5    E   b
3:   0.7   0.9    A   a
4:  -0.3   0.8    C   c
5:  -0.8   0.6    C   b
> 
> str(mydf)
Classes ‘data.table’ and 'data.frame':  5 obs. of  4 variables:
 $ vnum1: num  0.6 -1.4 0.7 -0.3 -0.8
 $ vnum2: num  0.7 0.5 0.9 0.8 0.6
 $ vch1 : Factor w/ 4 levels "A","B","C","E": 2 4 1 3 3
 $ new  : chr  "a" "b" "a" "c" ...
 - attr(*, ".internal.selfref")=<externalptr> 
> 
> mydf$new = as.factor(mydf$new)
> str(mydf$new)
 Factor w/ 3 levels "a","b","c": 1 2 1 3 2
> levels(mydf$new)= c('c','a','b')
> str(mydf$new)
 Factor w/ 3 levels "c","a","b": 1 2 1 3 2
> mydf
   vnum1 vnum2 vch1 new
1:   0.6   0.7    B   c
2:  -1.4   0.5    E   a
3:   0.7   0.9    A   c
4:  -0.3   0.8    C   b
5:  -0.8   0.6    C   a

The entire column 'new' has been changed. How can I do this correctly?

3 answers:

Answer 0 (score: 2)

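You can't change the levels like that. Assigning to levels() only renames the level labels, just as names(mydf)<-c("x","y") renames the columns of a data.frame. The underlying integer codes stay the same, so every value picks up a new label. What you want is to create a new factor with a different level order: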

mydf$new <- factor(mydf$new, levels=c('c','a','b'))
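
To make the difference concrete, here is a minimal base R sketch. The vector x is a hypothetical stand-in for the 'new' column from the question; the names relabeled and reordered are only for illustration.

```r
# Hypothetical vector mirroring the question's 'new' column
x <- c("a", "b", "a", "c", "b")

# Relabeling: the integer codes stay put, so the displayed values change
relabeled <- as.factor(x)
levels(relabeled) <- c("c", "a", "b")
as.character(relabeled)   # "c" "a" "c" "b" "a"

# Reordering: the codes are recomputed against the new level order,
# so the displayed values stay the same
reordered <- factor(x, levels = c("c", "a", "b"))
as.character(reordered)   # "a" "b" "a" "c" "b"
as.integer(reordered)     # 2 3 2 1 3
```

The first result reproduces the unwanted output in the question; the second keeps the data intact and only changes which level comes first.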

Answer 1 (score: 2)

I think you can use data.table syntax. Starting from

mydf
#    vnum1 vnum2 vch1 new
# 1:   0.6   0.7    B   a
# 2:  -1.4   0.5    E   b
# 3:   0.7   0.9    A   a
# 4:  -0.3   0.8    C   c
# 5:  -0.8   0.6    C   b

you can do

mydf[, new := factor(new, levels = c("c", "a", "b"))][]
#    vnum1 vnum2 vch1 new
# 1:   0.6   0.7    B   a
# 2:  -1.4   0.5    E   b
# 3:   0.7   0.9    A   a
# 4:  -0.3   0.8    C   c
# 5:  -0.8   0.6    C   b
str(mydf)
# Classes ‘data.table’ and 'data.frame':    5 obs. of  4 variables:
#  $ vnum1: num  0.6 -1.4 0.7 -0.3 -0.8
#  $ vnum2: num  0.7 0.5 0.9 0.8 0.6
#  $ vch1 : Factor w/ 4 levels "A","B","C","E": 2 4 1 3 3
#  $ new  : Factor w/ 3 levels "c","a","b": 2 3 2 1 3
#  - attr(*, ".internal.selfref")=<externalptr> 
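
A quick way to check that the update changed only the level order is sketched below. It assumes the data.table package is installed (hence the guard) and uses a hypothetical one-column table dt rather than the original mydf.

```r
# Runs only if data.table is available; dt is a hypothetical example table
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)

  dt <- data.table(new = c("a", "b", "a", "c", "b"))
  before <- dt$new   # keep the original character values for comparison

  # := updates the column by reference, so no reassignment is needed
  dt[, new := factor(new, levels = c("c", "a", "b"))]

  levels(dt$new)                            # "c" "a" "b"
  identical(as.character(dt$new), before)   # TRUE: values are unchanged
}
```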

Answer 2 (score: 1)

You can also use relevel to move a specific level to the front of the level list.

> mydf<-data.frame("h"=c(1,2,3,4,5),"var1"=c(1.2,3,4,21,1),"new"=c('a','b','a','c','b'))
> mydf$new = as.factor(mydf$new)
#> mydf
#  h var1 new
#1 1  1.2   a
#2 2  3.0   b
#3 3  4.0   a
#4 4 21.0   c
#5 5  1.0   b
#> str(mydf$new)
# Factor w/ 3 levels "a","b","c": 1 2 1 3 2
#> levels(mydf$new)
#[1] "a" "b" "c"

> mydf$new <- relevel(mydf$new, "c")             # makes "c" the first level
#> levels(mydf$new)
#[1] "c" "a" "b"
#> str(mydf$new)
# Factor w/ 3 levels "c","a","b": 2 3 2 1 3
#> mydf
#  h var1 new
#1 1  1.2   a
#2 2  3.0   b
#3 3  4.0   a
#4 4 21.0   c
#5 5  1.0   b
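
Note that relevel only moves one level to the front and leaves the others in their existing order. That happens to give c, a, b here, but an arbitrary order still needs factor() with an explicit levels argument. A small sketch, using a hypothetical factor f in place of mydf$new:

```r
# Hypothetical factor with the default alphabetical levels a, b, c
f <- factor(c("a", "b", "a", "c", "b"))

# relevel moves "c" to the front; "a" and "b" keep their relative order
moved <- relevel(f, ref = "c")
levels(moved)   # "c" "a" "b"

# For any other order (for example c, b, a), spell the levels out in full
full <- factor(f, levels = c("c", "b", "a"))
levels(full)    # "c" "b" "a"

# In both cases the values themselves are untouched
identical(as.character(moved), as.character(f))   # TRUE
identical(as.character(full), as.character(f))    # TRUE
```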