Using a column name stored in a variable with data.table and lapply

Date: 2019-01-10 16:09:37

Tags: r data.table lapply

I have data like this:

library(data.table)
set.seed(1)
testing1 <- data.table(type=c("stock","stock","bond","bond"),a=rnorm(4),b=rnorm(4),c=rnorm(4),d=rnorm(4),e=rnorm(4))

    type          a          b          c           d           e
1: stock -0.6264538  0.3295078  0.5757814 -0.62124058 -0.01619026
2: stock  0.1836433 -0.8204684 -0.3053884 -2.21469989  0.94383621
3:  bond -0.8356286  0.4874291  1.5117812  1.12493092  0.82122120
4:  bond  1.5952808  0.7383247  0.3898432 -0.04493361  0.59390132

This returns exactly what I want:

result1 <- testing1[,c(list(type=type),lapply(.SD, `-`, a)), .SDcols = b:e]

    type          b          c           d          e
1: stock  0.9559616  1.2022352  0.00521323  0.6102635
2: stock -1.0041117 -0.4890317 -2.39834321  0.7601929
3:  bond  1.3230577  2.3474098  1.96055953  1.6568498
4:  bond -0.8569561 -1.2054376 -1.64021441 -1.0013795

The problem is that the name of column a is dynamic: it is held in a variable. I would like to do something like this:

cn <- "a"
result2 <- testing1[,c(list(type=type), lapply(.SD, `-`, get(cn))), .SDcols = b:e]

But I get the error message: Error in FUN(X[[i]], ...) : non-numeric argument to binary operator

Any ideas would be much appreciated. Thanks.

1 Answer:

Answer 0: (score: 1)

We can extract the column from 'testing1' with [[. Going through .SD does not work here, because that column is not listed in .SDcols:

testing1[,c(list(type=type),lapply(.SD, `-`, testing1[[cn]])), .SDcols = b:e]
#   type          b          c           d          e
#1: stock  0.9559616  1.2022352  0.00521323  0.6102635
#2: stock -1.0041117 -0.4890317 -2.39834321  0.7601929
#3:  bond  1.3230577  2.3474098  1.96055953  1.6568498
#4:  bond -0.8569561 -1.2054376 -1.64021441 -1.0013795
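
As a quick sanity check, the sketch below rebuilds the question's data and compares the hard-coded version with the version that takes the column name from a variable. The names result1, result2 and cn follow the question; the comparison with all.equal is an addition for illustration, not part of the original answer.

```r
library(data.table)

# Rebuild the example data from the question
set.seed(1)
testing1 <- data.table(type = c("stock", "stock", "bond", "bond"),
                       a = rnorm(4), b = rnorm(4), c = rnorm(4),
                       d = rnorm(4), e = rnorm(4))

# Reference result: column a is hard-coded in j
result1 <- testing1[, c(list(type = type), lapply(.SD, `-`, a)), .SDcols = b:e]

# Same computation, but the column name comes from a variable and the
# column is pulled out of the full table with [[ rather than through .SD
cn <- "a"
result2 <- testing1[, c(list(type = type), lapply(.SD, `-`, testing1[[cn]])),
                    .SDcols = b:e]

# Compare the two tables; they should agree if the approach is correct
all.equal(result1, result2)
```

Because testing1[[cn]] is looked up on the full table, this form does not depend on how get resolves names inside j.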

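If we do want to use get, we have to make sure it is evaluated in the right environment. Inside lapply, the environment comes from .SD, which does not contain column 'a'. Use Map instead: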

testing1[,  Map(`-`, .SD, list(get(cn))), .SDcols = b:e]
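
Note that the Map call above returns only the differenced columns b to e, without type. A minimal sketch of how the type column could be kept, mirroring the c(list(type=type), ...) pattern from the question, is shown below. It assumes a data.table version in which .SD still honours .SDcols when get appears in j; the name diffs is made up for this example.

```r
library(data.table)

# Rebuild the example data from the question
set.seed(1)
testing1 <- data.table(type = c("stock", "stock", "bond", "bond"),
                       a = rnorm(4), b = rnorm(4), c = rnorm(4),
                       d = rnorm(4), e = rnorm(4))
cn <- "a"

# Map pairs each column of .SD with the single reference column;
# the length-1 list is recycled across the columns b to e.
# ASSUMPTION: .SD contains only the .SDcols columns even though get() is used in j.
diffs <- testing1[, c(list(type = type), Map(`-`, .SD, list(get(cn)))),
                  .SDcols = b:e]

diffs
```

If get misbehaves on an older installation, extracting the column with testing1[[cn]] as in the first snippet of the answer avoids the issue.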