How to get the integer value of a factor using mutate()

Time: 2018-06-27 05:44:52

Tags: r dplyr mutate levels

I am trying to use dplyr::mutate() to return, for each row, the integer value of the level of a factor column. This is what I have so far:

library(dplyr)  # provides tibble(), %>%, arrange(), group_by(), mutate()

a <- tibble(group = factor(c(rep(c('group1', 'groupA', 'groupB'), 4), rep('groupC', 3))),
            name1 = factor(c(rep(c('gene1', 'gene2', 'geneA', 'geneB'),3), 
                             c('gene1', 'gene2', 'geneA'))),
            name2 = factor(c(rep(c('geneB', 'geneA', 'gene2', 'gene1'), 3),
                             c('geneB', 'geneA', 'gene2')))) %>%
  arrange(group)

a <- group_by(a, group) %>%
  mutate(n = row_number(),
         n_max = max(n),
         lev1 = which(levels(a$name1) == name1))

This results in the error message:

Error in mutate_impl(.data, dots) : 
  Column `lev1` must be length 4 (the group size) or one, not 2

But if I just run which(levels(a$name1) == 'gene2'), I get the desired value, 2.

What causes this error, and how can I fix it?

1 answer:

Answer 0 (score: 2)

Is this what you are after?

group_by(a, group) %>%
    mutate(
        n = row_number(),
        n_max = max(n),
        lev1 = as.numeric(name1))
## A tibble: 15 x 6
## Groups:   group [4]
#   group  name1 name2     n n_max  lev1
#   <fct>  <fct> <fct> <int> <dbl> <dbl>
# 1 group1 gene1 geneB     1    4.    1.
# 2 group1 geneB gene1     2    4.    4.
# 3 group1 geneA gene2     3    4.    3.
# 4 group1 gene2 geneA     4    4.    2.
# 5 groupA gene2 geneA     1    4.    2.
# 6 groupA gene1 geneB     2    4.    1.
# 7 groupA geneB gene1     3    4.    4.
# 8 groupA geneA gene2     4    4.    3.
# 9 groupB geneA gene2     1    4.    3.
#10 groupB gene2 geneA     2    4.    2.
#11 groupB gene1 geneB     3    4.    1.
#12 groupB geneB gene1     4    4.    4.
#13 groupC gene1 geneB     1    3.    1.
#14 groupC gene2 geneA     2    3.    2.
#15 groupC geneA gene2     3    3.    3.

name1 is already a factor, so as.numeric returns the index of each value's factor level.
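As for why the original attempt fails, here is a minimal base R sketch of what probably happens inside one group. The vector below mirrors the name1 values of group1 from the output above. The level order is set explicitly so the example does not depend on locale, and the names lv, hits, idx_which, idx_int and idx_match are only illustrative. The comparison levels(...) == name1 is element-wise over the whole group, so which() returns the positions that happen to match rather than one index per row. That would explain the length of 2 in the error message.

```r
# Levels in the same order as in the question's output (gene1 = 1, ..., geneB = 4)
lv <- c("gene1", "gene2", "geneA", "geneB")

# name1 values of the first group (group1), as shown in the answer's output
name1 <- factor(c("gene1", "geneB", "geneA", "gene2"), levels = lv)

# Element-wise comparison: level i is compared with row i, not with every row
hits <- levels(name1) == name1   # TRUE FALSE TRUE FALSE
idx_which <- which(hits)         # positions 1 and 3, so length 2 instead of 4

# Vectorised alternatives that return one level index per row
idx_int <- as.integer(name1)                # integer codes: 1 4 3 2
idx_match <- match(name1, levels(name1))    # same result via an explicit lookup

print(idx_which)
print(idx_int)
print(idx_match)
```

Inside mutate(), as.integer(name1) would give an <int> column where as.numeric(name1) gives a <dbl> one; the index values are the same either way.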