How to loop over columns in a data.table in R

Asked: 2017-08-14 16:40:32

Tags: r dataframe data.table

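This problem has been bugging me all weekend. I want to count the number of 1s in a column, grouped by another column. I need to loop over many columns; the example below is just a simplified case. I expected DT1 and DT2 to give the same result, but DT1 clearly does not work. Can anyone tell me why? Thanks!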

library(data.table)

DT <- data.table(Sample.name = c("A","B","C","A","A","B"),
                 Class1 = c(1, 0, 1, 0, 1, 0),
                 Class2 = c(1, 1, 1, 0, 1, 1))

# Names of the columns to loop over
round.test <- colnames(DT)
round.test <- round.test[c(2, 3)]
round.test <- noquote(round.test)

# Expected to match DT2, but does not
DT1 <- DT[, sum((round.test[1]) == 1),
          by = Sample.name]

# Gives the counts I want
DT2 <- DT[, sum(Class1 == 1),
          by = Sample.name]

1 Answer:

Answer 0 (score: 1)

Starting from a character vector of column names, you can use get to retrieve the column values:

round.test <- colnames(DT)
round.test <- round.test[c(2, 3)]

DT[, sum(get(round.test[1]) == 1), .(Sample.name)]

#   Sample.name V1
#1:           A  2
#2:           B  0
#3:           C  1
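
As for why DT1 in the question fails: round.test[1] is just the character string "Class1", so comparing it with 1 never touches the column. A minimal sketch of the difference (the names string_cmp, without_get and with_get are only for illustration):

```r
library(data.table)

DT <- data.table(Sample.name = c("A","B","C","A","A","B"),
                 Class1 = c(1, 0, 1, 0, 1, 0),
                 Class2 = c(1, 1, 1, 0, 1, 1))
round.test <- colnames(DT)[c(2, 3)]

# Plain string comparison: 1 is coerced to "1", and "Class1" is not "1"
string_cmp <- round.test[1] == 1

# Without get(): each group sums a single FALSE, so every count is 0
without_get <- DT[, sum(round.test[1] == 1), by = Sample.name]

# With get(): the name is resolved to the actual column within each group
with_get <- DT[, sum(get(round.test[1]) == 1), by = Sample.name]
```

Wrapping the name in noquote only changes how it prints; it is still a string, not a reference to the column.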

DT[, lapply(round.test, function(col) sum(get(col) == 1)), .(Sample.name)]

#   Sample.name V1 V2
#1:           A  2  2
#2:           B  0  2
#3:           C  1  1
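
If you would rather write the loop out explicitly, something along these lines should also work. It is only a sketch: the list name results and the output column name count are my own choices, not part of the question.

```r
library(data.table)

DT <- data.table(Sample.name = c("A","B","C","A","A","B"),
                 Class1 = c(1, 0, 1, 0, 1, 0),
                 Class2 = c(1, 1, 1, 0, 1, 1))
round.test <- colnames(DT)[c(2, 3)]

# One small result table per column, stored in a list keyed by column name
results <- list()
for (col in round.test) {
  # get(col) turns the name held in `col` into the column itself
  results[[col]] <- DT[, .(count = sum(get(col) == 1)), by = Sample.name]
}

# Look at the counts for one column
results[["Class1"]]
```

This produces one table per column instead of one wide table, which can be handy when each column needs different follow-up processing.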

Alternatively, you can pass the column names via .SDcols and access the columns through .SD:

DT[, lapply(.SD, function(col) sum(col == 1)), by=.(Sample.name), .SDcols=round.test]

#   Sample.name Class1 Class2
#1:           A      2      2
#2:           B      0      2
#3:           C      1      1
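
When the columns of interest share a naming pattern, you may not need the name vector at all. The sketch below assumes a reasonably recent data.table version in which .SDcols accepts patterns(); the regular expression "^Class" and the name by_pattern are just for this example.

```r
library(data.table)

DT <- data.table(Sample.name = c("A","B","C","A","A","B"),
                 Class1 = c(1, 0, 1, 0, 1, 0),
                 Class2 = c(1, 1, 1, 0, 1, 1))

# Select every column whose name starts with "Class" and count the 1s per group
by_pattern <- DT[, lapply(.SD, function(col) sum(col == 1)),
                 by = .(Sample.name),
                 .SDcols = patterns("^Class")]
```

The result should have the same shape as the .SDcols=round.test version above, with the original column names kept.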