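I have a factor column of football position abbreviations, with about 17 unique values across 220 observations. I want to collapse those 17 unique values into just three factor levels.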
levels(nfldraft$Pos) <- list(Linemen = c("C","OG","OT","TE","DT","DE"), Small_Backs = c("CB","WR","FS"), Big_Backs = c("FB","ILB","OLB","P","QB","RB","SS","WR"))
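is what I tried. Printing nfldraft$Pos to the console shows 3 factor levels, but every value is either Linemen or Small_Backs, and all the other values are NA. Where did I go wrong? Thanks.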
Answer 0 (score: 4)
I made an example character vector containing all of the abbreviations:
my_example <- c("C","OG","OT","TE","DT","DE","CB","WR","FS",
"FB","ILB","OLB","P","QB","RB","SS","WR")
class(my_example)
[1] "character"
Then I replaced the abbreviations with the desired level names (you could also use gsub here, or any of a number of other methods):
my_example[my_example %in% c("C","OG","OT","TE","DT","DE")] <- "Linemen"
my_example[my_example %in% c("CB","WR","FS")] <- "Small Backs"
my_example[my_example %in% c("FB","ILB","OLB","P",
"QB","RB","SS","WR")] <- "Big Backs"
Then I turned it into a factor:
my_example <- as.factor(my_example)
head(my_example)
[1] Linemen Linemen Linemen Linemen Linemen Linemen
Levels: Big Backs Linemen Small Backs
tail(my_example)
[1] Big Backs   Big Backs   Big Backs   Big Backs   Big Backs   Small Backs
Levels: Big Backs Linemen Small Backs
class(my_example)
[1] "factor"
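One of the other methods alluded to above is a named lookup vector. The following is only a sketch, not part of the original answer: the names groups, lookup, and grouped are made up for illustration, the input is a shortened sample, and "WR" is listed under a single group so that every abbreviation has exactly one target.

```r
# Hypothetical grouping; "WR" is assigned to one group only (an assumption).
groups <- list(
  "Linemen"     = c("C", "OG", "OT", "TE", "DT", "DE"),
  "Small Backs" = c("CB", "WR", "FS"),
  "Big Backs"   = c("FB", "ILB", "OLB", "P", "QB", "RB", "SS")
)

# Named character vector: names are abbreviations, values are group names.
lookup <- setNames(rep(names(groups), lengths(groups)),
                   unlist(groups, use.names = FALSE))

# Shortened sample input, for illustration only.
my_example <- c("C", "WR", "QB", "FS", "DE")

# Index the lookup by abbreviation, then build the factor with a fixed level order.
grouped <- factor(unname(lookup[my_example]), levels = names(groups))
table(grouped)
```

Setting levels explicitly keeps the level order under your control rather than alphabetical, and any abbreviation missing from the lookup shows up as NA instead of being silently kept.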
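Answer 1 (score: 1)
This is a good illustration of why a fully reproducible example is needed. As written, the OP's code looks like it should work. Taking the sample input from @Hack-R's answer: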
my_example <- c("C","OG","OT","TE","DT","DE","CB","WR","FS",
"FB","ILB","OLB","P","QB","RB","SS","WR")
the OP's original code works as-is:
nfldraft = list(Pos = factor(my_example))
levels(nfldraft$Pos) <- list(
Linemen = c("C","OG","OT","TE","DT","DE"),
Small_Backs = c("CB","WR","FS"),
Big_Backs = c("FB","ILB","OLB","P","QB","RB","SS","WR")
)
table(nfldraft$Pos)
#     Linemen Small_Backs   Big_Backs
#           6           2           9
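This is entirely consistent with the documentation for levels<-:
levels(x) <- value
value: a valid value for levels(x) ... For the factor method, a vector of character strings with length at least the number of levels of x, or a named list specifying how to rename the levels.
So it seems something else is wrong with the OP's input. With the named-list form, any existing level that is not mentioned in the list becomes NA, so abbreviations in the real data that differ from the ones listed would produce exactly the NAs described.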
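One way to check the input is to compare the factor's existing levels against the abbreviations named in the list before reassigning. The sketch below uses made-up data: the values "LB" and "WR " (with a trailing space) are hypothetical stand-ins for whatever the real nfldraft$Pos contains, and "WR" is listed under one group only.

```r
# Grouping list as in the question, with "WR" listed once (an assumption).
groups <- list(
  Linemen     = c("C", "OG", "OT", "TE", "DT", "DE"),
  Small_Backs = c("CB", "WR", "FS"),
  Big_Backs   = c("FB", "ILB", "OLB", "P", "QB", "RB", "SS")
)

# Hypothetical stand-in for nfldraft$Pos; "LB" and "WR " are not in the list.
pos <- factor(c("C", "QB", "LB", "WR ", "CB"))

# Levels that the list does not cover would turn into NA.
unmapped <- setdiff(levels(pos), unlist(groups))
print(unmapped)

# Stray whitespace is a common culprit; trim it and check again.
levels(pos) <- trimws(levels(pos))
still_unmapped <- setdiff(levels(pos), unlist(groups))
print(still_unmapped)

# Collapse the levels and count how many observations ended up NA.
collapsed <- pos
levels(collapsed) <- groups
n_missing <- sum(is.na(collapsed))
print(n_missing)
```

If unmapped is non-empty on the real data, those are the abbreviations to add to the list (or to clean up) before collapsing the levels.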
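Answer 2 (score: 0)
You can also use the mapvalues() function. Note that it comes from the plyr package rather than dplyr; mutate() and %>% are the dplyr parts.
For your example, that would be: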
Linemen_levels = c("C","OG","OT","TE","DT","DE")
Small_Backs_levels = c("CB","WR","FS")
Big_Backs_levels = c("FB","ILB","OLB","P","QB","RB","SS","WR")
# mapvalues() lives in plyr; mutate() and %>% come from dplyr
nfldraft <- nfldraft %>% mutate(Pos = plyr::mapvalues(Pos,
  from = c(Linemen_levels, Small_Backs_levels, Big_Backs_levels),
  to = c(rep('Linemen', length(Linemen_levels)),
         rep('Small_Backs', length(Small_Backs_levels)),
         rep('Big_Backs', length(Big_Backs_levels)))))