I have a data frame with the following columns:
gene col1 col2 type
------------------------------
gene_1 a b 1
gene_2 aa bb 2
gene_3 a b 1
gene_4 aa bb 2
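I want to derive the column "type" from the columns "col2" and "col1". In other words, I need a classification in which rows sharing the same combination of "col2" and "col1" values get the same type. How can I do this in R?
Many thanks.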
Answer 0 (score: 3)
Based on the expected output, one option is to create a group index from the "col1" and "col2" columns.
library(dplyr)
df1 %>%
mutate(type = group_indices(., col1, col2))
#    gene col1 col2 type
#1 gene_1 a b 1
#2 gene_2 aa bb 2
#3 gene_3 a b 1
#4 gene_4 aa bb 2
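As far as I know, passing grouping columns directly to `group_indices()` has been deprecated since dplyr 1.0.0. A minimal sketch of the same idea with `group_by()` and `cur_group_id()`, assuming dplyr 1.0.0 or later is installed (the names `df_in` and `res_grouped` are only illustrative):

```r
library(dplyr)

# Input rebuilt without the target column (illustrative name df_in)
df_in <- data.frame(
  gene = c("gene_1", "gene_2", "gene_3", "gene_4"),
  col1 = c("a", "aa", "a", "aa"),
  col2 = c("b", "bb", "b", "bb"),
  stringsAsFactors = FALSE
)

res_grouped <- df_in %>%
  group_by(col1, col2) %>%          # one group per (col1, col2) combination
  mutate(type = cur_group_id()) %>% # integer id of the current group
  ungroup()                         # drop the grouping again

res_grouped$type
# 1 2 1 2
```

The ids follow the sorted order of the grouping keys, so ("a", "b") gets 1 and ("aa", "bb") gets 2 here.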
If there are several column names, one option is to convert the column-name strings to symbols and then splice and evaluate them (!!!):
df1 %>%
mutate(type = group_indices(., !!! rlang::syms(names(.)[2:3])))
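When the column names are held as strings, a possible alternative that avoids symbols is `across(all_of(...))` inside `group_by()`. This is a sketch assuming dplyr 1.0.0 or later; `df_cols`, `key_cols`, and `res_across` are illustrative names, not part of the original answer:

```r
library(dplyr)

# Illustrative copy of the input without the target column
df_cols <- data.frame(
  gene = c("gene_1", "gene_2", "gene_3", "gene_4"),
  col1 = c("a", "aa", "a", "aa"),
  col2 = c("b", "bb", "b", "bb"),
  stringsAsFactors = FALSE
)

key_cols <- names(df_cols)[2:3]  # "col1" "col2"

res_across <- df_cols %>%
  group_by(across(all_of(key_cols))) %>% # group by the columns named in key_cols
  mutate(type = cur_group_id()) %>%
  ungroup()

res_across$type
# 1 2 1 2
```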
Or with data.table:
library(data.table)
setDT(df1)[, type := .GRP, .(col1, col2)]
# Sample data used in the examples above
df1 <- structure(list(
  gene = c("gene_1", "gene_2", "gene_3", "gene_4"),
  col1 = c("a", "aa", "a", "aa"),
  col2 = c("b", "bb", "b", "bb"),
  type = c(1L, 2L, 1L, 2L)
), class = "data.frame", row.names = c(NA, -4L))
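For comparison, the same grouping can be sketched in base R without any packages. This is not part of the original answer; it assumes the key values never contain the separator character `"\x1f"`, and `df_base` and `key` are illustrative names. Like `.GRP` in data.table, it numbers groups in order of first appearance, whereas the dplyr approach numbers them by sorted key; on this sample data both orders coincide.

```r
# Illustrative copy of the input without the target column
df_base <- data.frame(
  gene = c("gene_1", "gene_2", "gene_3", "gene_4"),
  col1 = c("a", "aa", "a", "aa"),
  col2 = c("b", "bb", "b", "bb"),
  stringsAsFactors = FALSE
)

# Build one composite key per row from the grouping columns
key <- do.call(paste, c(df_base[c("col1", "col2")], sep = "\x1f"))

# Position of each key among the distinct keys = group id by first appearance
df_base$type <- match(key, unique(key))

df_base$type
# 1 2 1 2
```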