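I have a factor that can, in theory, take any value between 0 and 8, and I want to assign a label to each of these values. I have used birds as an example. However, some categories are empty, so R only creates levels for the values that actually occur, and each label ends up on the next non-empty category. This is a problem for my dataset because it is updated regularly: a category that used to be empty may no longer be empty, and then the assignment of factor levels is wrong.
Is there a way to assign levels in R more explicitly? In SPSS you can assign value labels, and they do not depend on which categories are actually used.
Thanks!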
x <- factor(c(1,3,5,6,7,6,5,3,1,8,1,6,7))
#the levels are supposed to correspond to the following values:
#0="blackbird"
#1="eagle"
#2="owl"
#3="sparrow"
#4="vulture"
#5="falcon"
#6="dove"
#7="seagull"
#8="penguin"
levels(x) <- c("blackbird", "eagle", "owl", "sparrow", "vulture", "falcon", "dove", "seagull", "penguin")
#now the levels do not correspond to the intended birds
Answer 0 (score: 0)
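You can use the levels argument of factor() to account for the missing categories. levels defines the values that x can take, whether or not they occur in the data.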
x = factor(c(1,3,5,6,7,6,5,3,1,8,1,6,7), levels = 0:8)
x
# [1] 1 3 5 6 7 6 5 3 1 8 1 6 7
#Levels: 0 1 2 3 4 5 6 7 8
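To check that the empty categories are really kept, tabulating the factor is a quick test. This is a minimal sketch using only base R (nlevels() and table()); the variable name counts is my own and not part of the answer.

```r
# Same factor as above, with all possible values declared up front
x <- factor(c(1,3,5,6,7,6,5,3,1,8,1,6,7), levels = 0:8)

# Number of declared levels, including the unused ones
nlevels(x)

# Frequency table: the unused levels 0, 2 and 4 appear with a count of 0
counts <- table(x)
counts
```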
If your values correspond to some other set of values, store the mapping in a named vector.
y = setNames(object = c("blackbird", "eagle", "owl", "sparrow", "vulture", "falcon", "dove", "seagull", "penguin"),
nm = 0:8)
y
# 0 1 2 3 4 5 6 7 8
#"blackbird" "eagle" "owl" "sparrow" "vulture" "falcon" "dove" "seagull" "penguin"
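To look up the corresponding value for each element of the factor, subset the named vector by name. Converting x to character first makes the lookup use the names; indexing with the factor itself (y[x]) would use its internal integer codes, which only gives the same result here because the levels 0:8 are in the same order as y.
y[as.character(x)]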
# 1 3 5 6 7 6 5 3 1 8 1 6 7
# "eagle" "sparrow" "falcon" "dove" "seagull" "dove" "falcon" "sparrow" "eagle" "penguin" "eagle" "dove" "seagull"
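If the goal is a factor whose levels are the bird names themselves (the closest equivalent to SPSS value labels), factor() also accepts a labels argument alongside levels. The following is a sketch under that assumption rather than part of the answer above; birds, codes and x_named are names chosen for illustration.

```r
# Labels in the same order as the possible values 0..8
birds <- c("blackbird", "eagle", "owl", "sparrow", "vulture",
           "falcon", "dove", "seagull", "penguin")

# Raw codes as they appear in the data (0, 2 and 4 do not occur)
codes <- c(1,3,5,6,7,6,5,3,1,8,1,6,7)

# levels fixes which values are possible; labels names them position by position
x_named <- factor(codes, levels = 0:8, labels = birds)

as.character(x_named)  # bird name for each observation
levels(x_named)        # all nine birds, including the unused ones
```

Because the mapping is fixed by levels = 0:8 and not by the values that happen to be present, updated data containing 0, 2 or 4 should pick up the intended label without any change to the code.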