Computing columns in a data.table using variables

Time: 2015-12-30 11:35:08

Tags: r data.table

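I am trying to compute columns in a data.table where the calculations are passed in through a variable. The following produces the result I am after: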

library(ggplot2)     # provides the mpg dataset
library(data.table)

dt <- data.table(mpg)
dt[, list(manufacturer, model, mpg_cyl_cty=cty/cyl, mpg_cyl_hwy=hwy/cyl)]

I would like mpg_cyl_cty=cty/cyl, mpg_cyl_hwy=hwy/cyl to come from a variable, like this:

var <- c('mpg_cyl_cty=cty/cyl', 'mpg_cyl_hwy=hwy/cyl')
dt[, list(manufacturer, model, var)]

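I suspect there are further issues here, such as which type var should have (built with c or with list) and how it should then be used in the call on dt.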

I hope someone has a suggestion, since I could not find anything on the web.

2 answers:

Answer 0: (score: 1)

library(ggplot2)
library(data.table)

dt <- data.table(mpg)
# The original calculation
dt1 <- dt[, list(manufacturer, model, mpg_cyl_cty=cty/cyl, mpg_cyl_hwy=hwy/cyl)]

var <- c('mpg_cyl_cty=cty/cyl', 'mpg_cyl_hwy=hwy/cyl')
# create a string to pass for evaluation
expr <- paste0("`:=`(", paste0(var, collapse = ", "), ")")

dt2 <- dt[, 
          .(manufacturer, model, cty, cyl, hwy)
         ][, eval(parse(text = expr))        # evaluate the expression
         ][, c("cty", "cyl", "hwy") := NULL] # delete unnecessary columns

print(all.equal(dt1, dt2))
#[1] TRUE
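
As a possible variation on the approach above, the whole list(...) call can be built from the strings and evaluated directly in j, so no columns are added by reference and none need to be deleted afterwards. This is only a sketch: the toy table below stands in for ggplot2::mpg (its values are made up), and the names expr and res are chosen for illustration.

```r
library(data.table)

# Toy data standing in for ggplot2::mpg (values invented for illustration)
dt <- data.table(
  manufacturer = c("audi", "ford"),
  model        = c("a4", "mustang"),
  cty          = c(18, 15),
  hwy          = c(28, 24),
  cyl          = c(4, 6)
)

var <- c("mpg_cyl_cty=cty/cyl", "mpg_cyl_hwy=hwy/cyl")

# Build the text of the full j expression and parse it into a single call:
# list(manufacturer, model, mpg_cyl_cty=cty/cyl, mpg_cyl_hwy=hwy/cyl)
expr <- parse(
  text = paste0("list(manufacturer, model, ", paste(var, collapse = ", "), ")")
)[[1]]

# A top-level eval() in j lets data.table substitute the stored call for j
res <- dt[, eval(expr)]
print(res)
```

Because no := is involved, dt itself is left unchanged. As with any eval(parse(.)) approach, the strings in var should come from a trusted source.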

Answer 1: (score: 0)

A slightly different approach that avoids eval(parse(.)) and operates on language objects instead.
Rather than c('mpg_cyl_cty=cty/cyl', 'mpg_cyl_hwy=hwy/cyl'), it takes only c("cty","hwy") as input.

library(data.table)
dt = as.data.table(ggplot2::mpg)
r.expected = dt[, list(manufacturer, model, mpg_cyl_cty=cty/cyl, mpg_cyl_hwy=hwy/cyl)]

cyl.ratio.j = function(var){
    substitute(lhs := rhs, list(
        lhs = as.name(paste0("mpg_cyl_", var)),
        rhs = call("/", as.name(var), as.name("cyl"))
    ))
}

r = dt[, eval(cyl.ratio.j("cty"))
       ][, eval(cyl.ratio.j("hwy"))
         ][, .SD, .SDcols = c("manufacturer", "model", paste0("mpg_cyl_", c("cty","hwy")))]

all.equal(r.expected, r)
#[1] TRUE
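
When there are more than two input columns, the chained calls above could be replaced by a loop over the variable names. The following is a minimal sketch under some assumptions: ratio_j is a hypothetical helper modelled on cyl.ratio.j, the toy table stands in for ggplot2::mpg with made-up values, and copy() is used so the original table is not modified by reference.

```r
library(data.table)

# Toy data standing in for ggplot2::mpg (values invented for illustration)
dt <- data.table(
  manufacturer = c("audi", "ford"),
  model        = c("a4", "mustang"),
  cty          = c(18, 15),
  hwy          = c(28, 24),
  cyl          = c(4, 6)
)

# Hypothetical helper: builds the call `mpg_cyl_<var> := <var> / cyl`
ratio_j <- function(var) {
  substitute(lhs := rhs, list(
    lhs = as.name(paste0("mpg_cyl_", var)),
    rhs = call("/", as.name(var), as.name("cyl"))
  ))
}

vars <- c("cty", "hwy")

out <- copy(dt)                      # := updates by reference, so work on a copy
for (v in vars) {
  out[, eval(ratio_j(v))]            # adds one ratio column per iteration
}

# Keep only the identifying columns and the newly computed ratios
out <- out[, c("manufacturer", "model", paste0("mpg_cyl_", vars)), with = FALSE]
print(out)
```

Adding another ratio then only requires extending vars, for example with a further numeric column name.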