I am trying to apply the scale() function to multiple columns of a data.table in order to define new columns. I get the following error:
library( data.table )
dt = data.table( id = rep( 1:10, each = 10 ),
A = rnorm( 100, 1, 2 ),
B = runif( 100, 0, 1 ),
C = rnorm( 100, 10, 20 ) )
cols_to_use = c( "A", "B", "C" )
cols_to_define = paste0( cols_to_use, "_std" )
# working
dt[ , ( cols_to_define ) := lapply( .SD, scale ), .SDcols = cols_to_use ]
# not working
dt[ , ( cols_to_define ) := lapply( .SD, scale ), by = id, .SDcols = cols_to_use ]
## Error in `[.data.table`(dt, , `:=`((cols_to_define), lapply(.SD, scale)), :
## All items in j=list(...) should be atomic vectors or lists.
## If you are trying something like j=list(.SD,newcol=mean(colA)) then
## use := by group instead (much quicker), or cbind or merge afterwards.
Any idea why this works when the by argument is removed?
Answer 0 (score: 2)
The problem is the output of scale, which is a matrix:
dim(scale(dt$A))
#[1] 100 1
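To see what the matrix returned by scale carries besides the values, here is a minimal base-R sketch. The vector x is hypothetical and only serves as an illustration.

```r
# Hypothetical input vector, used only for illustration
x <- c(2, 4, 6, 8)

s <- scale(x)

is.matrix(s)          # TRUE: scale() returns a one-column matrix
dim(s)                # 4 1
names(attributes(s))  # "dim" "scaled:center" "scaled:scale"
```

The dim attribute is what makes the result a matrix; the two scaled:* attributes store the centering and scaling values that were used.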
So we need to convert it to a vector by removing the dim attribute. Either as.vector or c will do this:
dt[ , ( cols_to_define ) := lapply( .SD, function(x)
c(scale(x)) ), by = id, .SDcols = cols_to_use ]
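As a quick sanity check of the grouped version, the sketch below standardizes a single column by id and confirms that each group ends up with mean close to 0 and standard deviation close to 1. The seed, the reduced table, and the column names A_std and A_std2 are assumptions made for the example, not part of the original question.

```r
library(data.table)

set.seed(42)  # arbitrary seed, only for reproducibility
dt <- data.table(id = rep(1:10, each = 10),
                 A  = rnorm(100, 1, 2))

# c() and as.vector() both strip the dim attribute from scale()'s result
dt[, A_std  := c(scale(A)),         by = id]
dt[, A_std2 := as.vector(scale(A)), by = id]

# Per-group mean should be ~0 and sd ~1 (up to floating-point error)
check <- dt[, .(m = mean(A_std), s = sd(A_std)), by = id]
all(abs(check$m) < 1e-12)
all(abs(check$s - 1) < 1e-12)

# Both ways of dropping dim give the same column
identical(dt$A_std, dt$A_std2)
```

Which of the two to use is mostly a matter of taste; as.vector states the intent more explicitly, while c is shorter.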
When there is no by, the dim attribute of the matrix is dropped, while the other attributes are kept.
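One way to check this on your own installation is to inspect the column after an ungrouped assignment. This is only a sketch: the small table and seed are made up, and the exact attributes you see may depend on the data.table version, so no expected output is shown.

```r
library(data.table)

set.seed(42)  # arbitrary seed, only for reproducibility
dt <- data.table(A = rnorm(10, 1, 2))

# Ungrouped assignment of scale()'s matrix result, as in the question
dt[, ("A_std") := lapply(.SD, scale), .SDcols = "A"]

dim(dt$A_std)                # inspect whether a dim attribute survived
names(attributes(dt$A_std))  # inspect which attributes were kept
```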