Question

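I have a factor that can, in theory, take any value from 0 to 8. I want to assign a level label to each of these values; I chose birds as an example. However, some categories are empty, which causes R to assign each label to the next non-empty category instead. This is a problem in my dataset because it is updated regularly: a category that used to be empty may no longer be empty, and then the assignment of factor levels gets messed up.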

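Is there a way to assign levels in R more explicitly? In SPSS you can assign value labels, and they do not depend on which categories are actually in use.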

Thanks!

x <- factor(c(1,3,5,6,7,6,5,3,1,8,1,6,7))

#the levels are supposed to correspond to the following values:
#0="blackbird" 
#1="eagle"
#2="owl"
#3="sparrow" 
#4="vulture"
#5="falcon" 
#6="dove" 
#7="seagull"
#8="penguin"

levels(x) <- c("blackbird", "eagle", "owl", "sparrow", "vulture", "falcon", "dove", "seagull", "penguin")

#now the levels do not correspond to the intended birds

Answer 1

You can use the levels argument to account for the missing categories. levels defines the values that x may take.

x = factor(c(1,3,5,6,7,6,5,3,1,8,1,6,7), levels = 0:8)
x
# [1] 1 3 5 6 7 6 5 3 1 8 1 6 7
#Levels: 0 1 2 3 4 5 6 7 8
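
As a quick check that the empty categories really are kept, here is a minimal sketch using the same data. The name counts is only illustrative and not part of the original answer.

```r
# Same data as above, with all nine possible values declared as levels
x <- factor(c(1,3,5,6,7,6,5,3,1,8,1,6,7), levels = 0:8)

# table() lists every declared level, including those with no observations
counts <- table(x)
counts

# nlevels() counts declared levels, not just the ones that occur in the data
nlevels(x)
```

The categories 0, 2 and 4 show up in the table with a count of zero, so they stay available when later updates of the data start to contain them.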

If your values correspond to some other set of values, store those in a named vector.

y = setNames(object = c("blackbird", "eagle", "owl", "sparrow", "vulture", "falcon", "dove", "seagull", "penguin"),
             nm = 0:8)
y
#          0           1           2           3           4           5           6           7           8 
#"blackbird"     "eagle"       "owl"   "sparrow"   "vulture"    "falcon"      "dove"   "seagull"   "penguin"

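To get the corresponding value for each element of the factor, index the named vector with it. Note that R indexes with the factor's integer codes rather than its labels; this works here because y is in the same order as the levels of x. To index by name explicitly, use y[as.character(x)].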

y[x]
#        1         3         5         6         7         6         5         3         1         8         1         6         7 
#  "eagle" "sparrow"  "falcon"    "dove" "seagull"    "dove"  "falcon" "sparrow"   "eagle" "penguin"   "eagle"    "dove" "seagull"
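
If the goal is for the factor itself to carry the bird names, one possible alternative is the labels argument of factor(), which pairs each entry of levels with a label by position. This is a sketch under the assumption that the mapping from the question (0 = "blackbird" through 8 = "penguin") is the intended one; the names birds and x2 are made up for illustration.

```r
# Labels in the same order as the values 0:8 (mapping taken from the question)
birds <- c("blackbird", "eagle", "owl", "sparrow", "vulture",
           "falcon", "dove", "seagull", "penguin")

# levels declares every possible value; labels renames them position by position
x <- factor(c(1,3,5,6,7,6,5,3,1,8,1,6,7), levels = 0:8, labels = birds)
levels(x)
as.character(x)[1:3]

# Hypothetical later update in which previously empty categories occur:
# the same call still maps 0, 2 and 4 to the intended birds
x2 <- factor(c(0, 2, 4), levels = 0:8, labels = birds)
as.character(x2)
```

Because the mapping is fixed by levels and labels rather than by whichever values happen to be present, it does not shift when the data are updated.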

How to assign factor levels to empty categories?
