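How to group factor levels in R

Time: 2016-08-13 16:50:38

Tags: r

I have a factor column of football position abbreviations, with about 17 unique values across 220 observations. I would like to end up with only three factor levels that cover those 17 unique values.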

levels(nfldraft$Pos) <- list(Linemen = c("C","OG","OT","TE","DT","DE"), Small_Backs =  c("CB","WR","FS"), Big_Backs = c("FB","ILB","OLB","P","QB","RB","SS","WR"))

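That is what I tried. Printing nfldraft$Pos to the console shows three factor levels, but every value is either Linemen or Small_Backs, and all the other values are NA. Where did I go wrong? Thanks.

3 Answers:

Answer 0: (score: 4)

I made an example character vector containing all of the abbreviations: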

my_example <- c("C","OG","OT","TE","DT","DE","CB","WR","FS", 
                "FB","ILB","OLB","P","QB","RB","SS","WR")
class(my_example)
[1] "character"

Then I replaced the abbreviations with the desired levels (you could also use gsub here, or any of a number of other methods):

my_example[my_example %in% c("C","OG","OT","TE","DT","DE")] <- "Linemen"
my_example[my_example %in% c("CB","WR","FS")]               <- "Small Backs"
my_example[my_example %in% c("FB","ILB","OLB","P",
                             "QB","RB","SS","WR")]          <- "Big Backs"

Then I turned it into a factor:

my_example <- as.factor(my_example)
head(my_example)
[1] Linemen Linemen Linemen Linemen Linemen Linemen
Levels: Big Backs Linemen Small Backs
tail(my_example)
[1] Big Backs   Big Backs   Big Backs   Big Backs   Big Backs   Small Backs
Levels: Big Backs Linemen Small Backs
class(my_example)
[1] "factor"
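As one of the "other methods" mentioned above, a named lookup vector can do the same recoding in a single step. This is only a minimal sketch and not part of the original answer: the names `pos_group` and `grouped` are made up for illustration, and "WR" is listed once (under Small Backs) because a lookup vector needs one group per abbreviation.

```r
# Hypothetical lookup table: names are abbreviations, values are group labels.
pos_group <- c(
  C = "Linemen", OG = "Linemen", OT = "Linemen",
  TE = "Linemen", DT = "Linemen", DE = "Linemen",
  CB = "Small Backs", WR = "Small Backs", FS = "Small Backs",
  FB = "Big Backs", ILB = "Big Backs", OLB = "Big Backs", P = "Big Backs",
  QB = "Big Backs", RB = "Big Backs", SS = "Big Backs"
)

my_example <- c("C","OG","OT","TE","DT","DE","CB","WR","FS",
                "FB","ILB","OLB","P","QB","RB","SS","WR")

# Index the lookup by abbreviation; an abbreviation missing from
# pos_group would come back as NA instead of raising an error.
grouped <- factor(unname(pos_group[my_example]),
                  levels = c("Linemen", "Small Backs", "Big Backs"))
table(grouped)
```

Passing `levels` explicitly fixes the order of the groups instead of leaving it to alphabetical sorting.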

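Answer 1: (score: 1)

This is a good example of why a fully reproducible example is needed. In fact, the OP's code looks like it should work. Taking the sample input from @Hack-R: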

my_example <- c("C","OG","OT","TE","DT","DE","CB","WR","FS", 
                "FB","ILB","OLB","P","QB","RB","SS","WR")

The OP's original code works as is:

nfldraft = list(Pos = factor(my_example))
levels(nfldraft$Pos) <- list(
  Linemen = c("C","OG","OT","TE","DT","DE"), 
  Small_Backs =  c("CB","WR","FS"), 
  Big_Backs = c("FB","ILB","OLB","P","QB","RB","SS","WR")
)
table(nfldraft$Pos)
#     Linemen Small_Backs   Big_Backs 
#           6           2           9 

This is fully consistent with the documentation on how to use levels<-:

levels(x) <- value

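value: a valid value for levels(x) ... For the factor method, a vector of character strings with length at least the number of levels of x, or a named list specifying how to rename the levels.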
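To make the named-list form concrete, here is a small sketch on a toy factor (the data are made up for illustration). Each list name becomes a new level, and the vector under it lists the old levels to merge into it. An old level that is left out of the list turns into NA, which would be consistent with the NA values described in the question.

```r
f <- factor(c("a", "b", "c", "d"))

# Merge "a" and "b" into "AB", rename "c" to "C"; "d" is deliberately omitted.
levels(f) <- list(AB = c("a", "b"), C = "c")

levels(f)  # the new levels, in the order given in the list
f          # the observation that used to be "d" is now NA
```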

So it seems that something else is wrong with the OP's input.
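One way to check the input is sketched below. It assumes that the real column contains abbreviations missing from the list, or values padded with whitespace. The question does not show the data, so `pos` and its sample values are hypothetical stand-ins for nfldraft$Pos.

```r
groups <- list(
  Linemen     = c("C","OG","OT","TE","DT","DE"),
  Small_Backs = c("CB","WR","FS"),
  Big_Backs   = c("FB","ILB","OLB","P","QB","RB","SS","WR")
)
listed <- unlist(groups, use.names = FALSE)

# Hypothetical stand-in for nfldraft$Pos; replace with the real column.
pos <- factor(c("C", "QB ", "LB", "WR", "OT"))

# Levels that no group covers; these would become NA after levels<-.
uncovered <- setdiff(levels(pos), listed)
uncovered

# Abbreviations assigned to more than one group.
in_two_groups <- unique(listed[duplicated(listed)])
in_two_groups

# What is still uncovered once stray whitespace is trimmed.
still_uncovered <- setdiff(trimws(levels(pos)), listed)
still_uncovered
```

If `uncovered` is non-empty on the real data, those abbreviations need to be added to one of the groups (or cleaned up) before the levels are reassigned.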

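Answer 2: (score: 0)

You can also use the mapvalues() function from the plyr package, combined with mutate() from dplyr.

For your example, that would be: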

library(dplyr)  # provides %>% and mutate(); mapvalues() is called from plyr

Linemen_levels = c("C","OG","OT","TE","DT","DE")
Small_Backs_levels = c("CB","WR","FS")
Big_Backs_levels = c("FB","ILB","OLB","P","QB","RB","SS","WR")

# Each rep() builds one target label per abbreviation in the matching group
nfldraft <- nfldraft %>% mutate(Pos = plyr::mapvalues(Pos,
                 from = c(Linemen_levels, Small_Backs_levels, Big_Backs_levels),
                 to = c(rep('Linemen', length(Linemen_levels)),
                        rep('Small_Backs', length(Small_Backs_levels)),
                        rep('Big_Backs', length(Big_Backs_levels)))))