Cannot select a subset of columns from a data.table

Asked: 2013-07-04 19:10:18

Tags: r split dataframe data.table

I have a very annoying problem in my code:

library(data.table)

# Example data: an id column plus three numeric columns
a <- 1:20
b <- rnorm(20)
c <- rnorm(20)
d <- rnorm(20)
final <- data.frame(a, b, c, d)

e <- data.table(final)
# Sum columns 2-4 (b, c, d) within each group of "a"; my real data frame is much larger
g <- e[, lapply(.SD, sum), by = "a", .SDcols = 2:4]
h <- g[, 2:4]  # intended: columns 2-4 of g

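h should contain columns 2-4 of g, but instead it is just the vector 2:4. Further down in my script there are more lines that select columns with df[, columns], so I need this form of indexing to work. Any ideas on how to solve this would be much appreciated.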

1 answer:

Answer 0 (score: 3)

The first question in the data.table FAQ covers exactly this (why DT[,5] returns 5):

Because, by default, unlike a data.frame, the 2nd argument is an 
expression which is evaluated within the scope of DT. 5 evaluates to 5.
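
To illustrate what "evaluated within the scope of DT" means, here is a minimal sketch. The table DT and its columns x and y are made up for illustration and are not from the question:

```r
library(data.table)

# Hypothetical table, used only for this illustration
DT <- data.table(x = c(1, 2, 3), y = c(10, 20, 30))

# j is an expression; the columns of DT are visible in it as variables
total    <- DT[, sum(x)]   # 6
combined <- DT[, x + y]    # 11 22 33
```

Because j is treated as an expression like these, a bare number or range in j was simply evaluated and returned as-is in the behaviour the FAQ describes.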

The FAQ goes on to offer a workaround:

Having said this, there are some circumstances where referring to a column by
number is ok, such as a sequence of columns. In these situations just do:
DT[,5:10,with=FALSE] 

DT[,c(1,4,10),with=FALSE]
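
Applied to the code in the question, the workaround might look like the sketch below. The object names come from the question; the seed and the name h2 are arbitrary additions. Newer data.table releases may already treat a literal 2:4 in j as column positions, in which case with = FALSE should be harmless.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
e <- data.table(a = 1:20, b = rnorm(20), c = rnorm(20), d = rnorm(20))
g <- e[, lapply(.SD, sum), by = "a", .SDcols = 2:4]

# with = FALSE makes j be read as column positions, not as an expression
h <- g[, 2:4, with = FALSE]
names(h)  # "b" "c" "d"

# Selecting by name instead does not depend on column order
h2 <- g[, c("b", "c", "d"), with = FALSE]
```

If later code relies on data.frame-style df[, columns] indexing throughout, converting the result with as.data.frame(g) before that code runs is another option.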