How do I assign factor levels to empty categories?

Time: 2019-06-19 22:10:30

Tags: r dataframe

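I have a factor that can, in theory, take any value between 0 and 8. I want to assign a label to each of these values; I am using bird species as an example. However, some categories are empty, which causes R to assign each label to the next non-empty category instead. This is a problem in my dataset because it is updated regularly: a category that used to be empty may no longer be empty, and the assignment of factor levels is then wrong.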

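Is there a way to assign levels in R more explicitly? In SPSS you can assign value labels, and these do not depend on which categories are actually used.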

Thanks!

x <- factor(c(1,3,5,6,7,6,5,3,1,8,1,6,7))

#the levels are supposed to correspond to the following values:
#0="blackbird" 
#1="eagle"
#2="owl"
#3="sparrow" 
#4="vulture"
#5="falcon" 
#6="dove" 
#7="seagull"
#8="penguin"

levels(x) <- c("blackbird", "eagle", "owl", "sparrow", "vulture", "falcon", "dove", "seagull", "penguin")

#now the levels do not correspond to the intended birds

1 Answer:

Answer 0: (score: 0)

You can use the levels argument to account for the missing categories. levels defines the set of values that x can take.

x = factor(c(1,3,5,6,7,6,5,3,1,8,1,6,7), levels = 0:8)
x
# [1] 1 3 5 6 7 6 5 3 1 8 1 6 7
#Levels: 0 1 2 3 4 5 6 7 8
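
Building on the levels argument, factor() also accepts a labels argument, which attaches a name to every level regardless of whether that level occurs in the data. The following is a minimal sketch of that idea, not part of the original answer; the variable names codes, bird_labels and birds are chosen here for illustration only.

```r
# Raw codes as they arrive in the data (same values as in the question)
codes <- c(1, 3, 5, 6, 7, 6, 5, 3, 1, 8, 1, 6, 7)

# Full code book: element i of bird_labels belongs to code i - 1
bird_labels <- c("blackbird", "eagle", "owl", "sparrow", "vulture",
                 "falcon", "dove", "seagull", "penguin")

# levels lists every possible code, labels gives the name for each code
birds <- factor(codes, levels = 0:8, labels = bird_labels)

levels(birds)  # all nine labels, including the unused ones
table(birds)   # empty categories show up with a count of 0
```

Because the mapping is tied to the code book rather than to the values that happen to be present, it should stay the same when the data are updated and a previously empty category starts to appear.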

If your values correspond to some other values, store those in a named vector.

y = setNames(object = c("blackbird", "eagle", "owl", "sparrow", "vulture", "falcon", "dove", "seagull", "penguin"),
             nm = 0:8)
y
#          0           1           2           3           4           5           6           7           8 
#"blackbird"     "eagle"       "owl"   "sparrow"   "vulture"    "falcon"      "dove"   "seagull"   "penguin"

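To look up the corresponding value for each element of the factor, subset the named vector with it. Note that y[x] indexes by the factor's integer codes, which line up with y here only because both follow the order 0:8; y[as.character(x)] indexes by name instead.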

y[x]
#        1         3         5         6         7         6         5         3         1         8         1         6         7 
#  "eagle" "sparrow"  "falcon"    "dove" "seagull"    "dove"  "falcon" "sparrow"   "eagle" "penguin"   "eagle"    "dove" "seagull"
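
The lookup above relies on the factor having been created with levels = 0:8. As a hedged illustration of why that matters, the sketch below repeats the lookup on a factor built without explicit levels (as in the question) and compares indexing by integer code with indexing by name. The names bird_names, x_sparse, by_code and by_name are introduced here for illustration only.

```r
# Same code book as above, stored as a named character vector
bird_names <- setNames(c("blackbird", "eagle", "owl", "sparrow", "vulture",
                         "falcon", "dove", "seagull", "penguin"),
                       0:8)

# Factor created WITHOUT explicit levels: its levels are 1, 3, 5, 6, 7, 8
x_sparse <- factor(c(1, 3, 5, 6, 7, 6, 5, 3, 1, 8, 1, 6, 7))

# Indexing with the factor itself uses its integer codes (1, 2, 3, ...),
# so the value 1 is looked up at position 1 and becomes "blackbird"
by_code <- unname(bird_names[x_sparse])

# Indexing with the level labels as character strings matches on names,
# so the value 1 is looked up under the name "1" and becomes "eagle"
by_name <- unname(bird_names[as.character(x_sparse)])

by_code[1]
by_name[1]
```

Converting to character before subsetting does not depend on which levels happen to be present, so it is the safer choice when the factor may have been built without a complete levels vector.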