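I have the following data and code. I am trying to change the order of the levels (from a,b,c to c,a,b) by converting the character column to a factor and then assigning new levels. However, this also changes the values: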
> mydf$new = c('a','b','a','c','b')
> mydf
vnum1 vnum2 vch1 new
1: 0.6 0.7 B a
2: -1.4 0.5 E b
3: 0.7 0.9 A a
4: -0.3 0.8 C c
5: -0.8 0.6 C b
>
> str(mydf)
Classes ‘data.table’ and 'data.frame': 5 obs. of 4 variables:
$ vnum1: num 0.6 -1.4 0.7 -0.3 -0.8
$ vnum2: num 0.7 0.5 0.9 0.8 0.6
$ vch1 : Factor w/ 4 levels "A","B","C","E": 2 4 1 3 3
$ new : chr "a" "b" "a" "c" ...
- attr(*, ".internal.selfref")=<externalptr>
>
> mydf$new = as.factor(mydf$new)
> str(mydf$new)
Factor w/ 3 levels "a","b","c": 1 2 1 3 2
> levels(mydf$new)= c('c','a','b')
> str(mydf$new)
Factor w/ 3 levels "c","a","b": 1 2 1 3 2
> mydf
vnum1 vnum2 vch1 new
1: 0.6 0.7 B c
2: -1.4 0.5 E a
3: 0.7 0.9 A c
4: -0.3 0.8 C b
5: -0.8 0.6 C a
The whole column 'new' has been changed. How can I do this correctly?
Answer 0 (score: 2)
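You can't just change the levels like that. Assigning to levels() only renames the labels of the existing levels, much as names(mydf)<-c("x","y") renames the columns of a data.frame without touching the data. What you want is to create a new factor with a different level order: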
mydf$new <- factor(mydf$new, levels=c('c','a','b'))
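To make the difference between renaming and reordering concrete, here is a minimal base-R sketch. The vector x is a hypothetical stand-in for mydf$new, and the names renamed and reordered are only illustrative.

```r
# Hypothetical vector standing in for mydf$new
x <- c("a", "b", "a", "c", "b")

# Renaming: `levels<-` replaces the labels position by position,
# so every "a" becomes "c", every "b" becomes "a", every "c" becomes "b".
renamed <- factor(x)
levels(renamed) <- c("c", "a", "b")

# Reordering: factor() with levels= keeps each value and only changes the order.
reordered <- factor(x, levels = c("c", "a", "b"))

as.character(renamed)    # "c" "a" "c" "b" "a"  (values changed)
as.character(reordered)  # "a" "b" "a" "c" "b"  (values preserved)
levels(reordered)        # "c" "a" "b"
```

The renamed vector reproduces the unwanted result from the question, while the reordered one keeps the original values.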
Answer 1 (score: 2)
I think you can use data.table syntax. Starting from
mydf
# vnum1 vnum2 vch1 new
# 1: 0.6 0.7 B a
# 2: -1.4 0.5 E b
# 3: 0.7 0.9 A a
# 4: -0.3 0.8 C c
# 5: -0.8 0.6 C b
you can do
mydf[, new := factor(new, levels = c("c", "a", "b"))][]
# vnum1 vnum2 vch1 new
# 1: 0.6 0.7 B a
# 2: -1.4 0.5 E b
# 3: 0.7 0.9 A a
# 4: -0.3 0.8 C c
# 5: -0.8 0.6 C b
str(mydf)
# Classes ‘data.table’ and 'data.frame': 5 obs. of 4 variables:
# $ vnum1: num 0.6 -1.4 0.7 -0.3 -0.8
# $ vnum2: num 0.7 0.5 0.9 0.8 0.6
# $ vch1 : Factor w/ 4 levels "A","B","C","E": 2 4 1 3 3
# $ new : Factor w/ 3 levels "c","a","b": 2 3 2 1 3
# - attr(*, ".internal.selfref")=<externalptr>
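The str() output above shows the integer codes 2 3 2 1 3 for new, even though the printed values are unchanged. A small base-R sketch of why, using a hypothetical stand-alone factor f rather than the data.table column:

```r
# Hypothetical factor with the same values and level order as mydf$new above
f <- factor(c("a", "b", "a", "c", "b"), levels = c("c", "a", "b"))

# Each code is the position of the value within levels(f):
# "c" is level 1, "a" is level 2, "b" is level 3.
codes <- as.integer(f)            # 2 3 2 1 3

# Indexing the levels by the codes recovers the original labels.
labels_back <- levels(f)[codes]   # "a" "b" "a" "c" "b"
```

So the codes change when the level order changes, but the label attached to each row stays the same.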
Answer 2 (score: 1)
You can also use relevel to move a specific level to the front of the list of levels.
mydf <- data.frame("h"=c(1,2,3,4,5),"var1"=c(1.2,3,4,21,1),"new"=c('a','b','a','c','b'))
mydf$new = as.factor(mydf$new)
#> mydf
# h var1 new
#1 1 1.2 a
#2 2 3.0 b
#3 3 4.0 a
#4 4 21.0 c
#5 5 1.0 b
#> str(mydf$new)
# Factor w/ 3 levels "a","b","c": 1 2 1 3 2
#> levels(mydf$new)
#[1] "a" "b" "c"
mydf$new <- relevel(mydf$new, "c") # makes "c" the first level
#> levels(mydf$new)
#[1] "c" "a" "b"
#> str(mydf$new)
# Factor w/ 3 levels "c","a","b": 2 3 2 1 3
#> mydf
# h var1 new
#1 1 1.2 a
#2 2 3.0 b
#3 3 4.0 a
#4 4 21.0 c
#5 5 1.0 b
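Note that relevel only moves one level to the front and leaves the others in their existing order. It gives c,a,b here because a and b were already in that order. A minimal sketch of the difference, with a hypothetical factor f and illustrative names moved and full:

```r
# Hypothetical factor with default (alphabetical) levels "a","b","c"
f <- factor(c("a", "b", "a", "c", "b"))

# relevel() moves only the reference level to the front;
# the remaining levels keep their relative order.
moved <- relevel(f, ref = "b")
levels(moved)   # "b" "a" "c"

# For an arbitrary order, spell out every level with factor().
full <- factor(f, levels = c("c", "b", "a"))
levels(full)    # "c" "b" "a"

# In both cases the values themselves are untouched.
as.character(moved)  # "a" "b" "a" "c" "b"
as.character(full)   # "a" "b" "a" "c" "b"
```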