R data.table operations with multiple groups in a single data.table and an outside function with lapply

Asked: 2014-08-12 20:47:11

Tags: r data.table

I am trying to use this question and answer (R data.table using lapply on functions defined outside) to help me answer a question I asked earlier (operations (+, -, /, *) on unequal-sized data.table).

I have included a modified example below:

library(data.table)

constants <- data.table(252.164401, 3785.412, 453.59237)

input1a <- data.table(ID = c(37, 45, 900), a1 = c(1, 2, 3),
                      a2 = c(43, 320, 390),
                      b1 = c(-0.94, 2.2, -1.223), b2 = c(2.32, 4.54, 7.21),
                      c1 = c(1, 2, 3), c2 = c(-0.94, 2.2, -1.223))
setkey(input1a, ID)

dput(input1a)
structure(list(ID = c(37, 45, 900), a1 = c(1, 2, 3), a2 = c(43, 320, 390),
b1 = c(-0.94, 2.2, -1.223), b2 = c(2.32, 4.54, 7.21), c1 = c(1, 2, 3), 
c2 = c(-0.94, 2.2, -1.223)), .Names = c("ID", "a1", "a2", "b1", "b2", "c1", 
"c2"), row.names = c(NA, -3L), class = c("data.table", "data.frame"), 
.internal.selfref = <pointer: 0x39c3f38>, sorted = "ID")

# input1a
#     ID  a1  a2       b1     b2  c1      c2
# 1:  37  1   43   -0.940   2.32   1  -0.940
# 2:  45  2  320    2.200   4.54   2   2.200
# 3: 900  3  390   -1.223   7.21   3  -1.223

Given the error message below, what needs to change in the following two statements so that the two arguments "b" and "c" can be found?

fun <- function(a, b, c, wherea=parent.frame(),whereb=parent.frame(),
wherec=parent.frame()) {
return(get(a,wherea) - constants$constants[2] * (get(b, whereb) - 
get(c, wherec)))
}

input1a[, lapply(c('a1', 'a2', 'b1', 'b2', 'c1', 'c2'), fun, wherea=.SD,
whereb=.SD, wherec=.SD), by = key(input1a)]

# Error in get(b, whereb) : argument "b" is missing, with no default

This is the result I want from input1a:

# input1a
#     ID        V1            V2
# 1:  37   7344.699    -12297.44
# 2:  45  -755.0824    -8537.864
# 3: 900   15988.79    -31532.38

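Thank you.


Update

Based on eddi's answer and Biogeek's correction (constants$V2 instead of constants$constants[2]), this is the code I used to solve the simplified example above: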

fun <- function(a, b, c) a - constants$V2 * (b - c)

input1a[, lapply(1:2, function(i) fun(get(paste0('a', i)),
                                  get(paste0('b', i)),
                                  get(paste0('c', i)))),
by = ID]

#     ID         V1           V2
# 1:  37  7344.6993   -12297.443
# 2:  45  -755.0824    -8537.864
# 3: 900 15988.7949   -31532.379
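
The correction from constants$constants[2] to constants$V2 mentioned in this update can be checked with a short sketch. It assumes only the constants table defined in the question: because data.table() is called there without column names, the columns get the default names V1, V2, and V3, so there is no column called constants.

```r
library(data.table)

# Same table as in the question: no column names are supplied
constants <- data.table(252.164401, 3785.412, 453.59237)

names(constants)      # default names assigned by data.table()
constants$constants   # NULL, because no column has this name
constants$V2          # the second constant, 3785.412
```

Since constants$constants is NULL, indexing it with [2] also gives NULL, and arithmetic with NULL yields a zero-length result rather than the intended values.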

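2 Answers:

Answer 0: (score: 2)

Your function requires the arguments a, b, and c, but you only pass a. Hence the error.

I don't see why you are doing the get inside the function. I would do it like this: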
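
As a minimal base-R sketch of why this happens (the function f and the numbers are made up for illustration, not taken from the question's function): lapply hands each element of its input to the function as the first argument only, whereas Map walks several vectors in parallel.

```r
# Hypothetical three-argument function, for illustration only
f <- function(a, b, c) a - (b - c)

# lapply() supplies each element as the FIRST argument only,
# so b and c are never passed and the call fails.
res <- tryCatch(lapply(c("a1", "a2"), f),
                error = function(e) conditionMessage(e))
print(res)

# Map() takes one element from each vector per call,
# so a, b, and c are all supplied.
out <- Map(f, c(1, 43), c(-0.94, 2.32), c(1, -0.94))
print(out)
```

Extra arguments can also be passed through lapply itself (for example lapply(x, f, b = ..., c = ...)), but they are then the same for every element.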

# whatever function
fun = function(a, b, c) a + b + c

# evaluate inside the data.table, *then* pass it to your function
input1a[, lapply(1:2, function(i) fun(get(paste0('a', i)),
                                      get(paste0('b', i)),
                                      get(paste0('c', i)))),
          by = ID]

If you prefer, it should be clear how to change this to do the get inside the function.

Answer 1: (score: 1)

I would keep it simple:
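
One way that variant might look is sketched below. The name fun_get and its arguments i and where are hypothetical; the sketch assumes the data and constants from the question and uses the question's formula rather than a + b + c. The function receives the column suffix and looks the columns up in .SD itself.

```r
library(data.table)

constants <- data.table(252.164401, 3785.412, 453.59237)
input1a <- data.table(ID = c(37, 45, 900), a1 = c(1, 2, 3),
                      a2 = c(43, 320, 390),
                      b1 = c(-0.94, 2.2, -1.223), b2 = c(2.32, 4.54, 7.21),
                      c1 = c(1, 2, 3), c2 = c(-0.94, 2.2, -1.223))
setkey(input1a, ID)

# Hypothetical helper: i is the column suffix, where is the object
# (here .SD) in which the columns a<i>, b<i>, c<i> are looked up.
fun_get <- function(i, where) {
  get(paste0("a", i), where) -
    constants$V2 * (get(paste0("b", i), where) - get(paste0("c", i), where))
}

# lapply() passes 1 and 2 as i; where = .SD is the same for every call
res <- input1a[, lapply(1:2, fun_get, where = .SD), by = ID]
print(res)
```

The design trade-off is that the function now depends on the column naming scheme, whereas evaluating the columns first keeps the function a plain arithmetic helper.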

fun <- function(a1, a2, b1, b2, c1, c2) {
    res1 = (a1 - constants$V2 * (b1 - c1))
    res2 = (a2 - constants$V2 * (b2 - c2))
    return(list(res1, res2))
}
# no get or apply function

input1a[, fun(a1, a2, b1, b2, c1, c2), by=ID]

# input1a
#     ID         V1         V2
# 1:  37  7344.6993 -12297.443
# 2:  45  -755.0824  -8537.864
# 3: 900 15988.7949 -31532.379