Find the closest value by group

Time: 2018-09-17 08:41:48

Tags: r data.table

I am looking for a practical way (preferably using data.table) to retrieve the value closest to 0 for each group.

Suppose the following DT:

set.seed(1)
library(data.table)
DT <- data.table(val = rnorm(1000), group = rep(1:10, each = 10)) # 10 groups

I tried combining by = group with roll = "nearest", but that only returns the closest value across all groups, not one per group:

DT[val == 0, val, by = group, roll = "nearest"]
#   group       value
#1:     8 0.001105352

I could of course repeat the process for each group, but that becomes impractical as the number of groups grows. For example:

res <- rbind(DT[val == 0 & group == 1, val, by = group, roll = "nearest"],
             DT[val == 0 & group == 2, val, by = group, roll = "nearest"],
             DT[val == 0 & group == 3, val, by = group, roll = "nearest"],
             ...)

Maybe I am missing some data.table functionality?

1 answer:

Answer 0 (score: 3)

You don't necessarily need a join for this.

A possible solution using a combination of min and abs:

DT[, .(closest.val.to.zero = val[abs(val) == min(abs(val))]), by = group]

which gives:

    group closest.val.to.zero
 1:     1         0.011292688
 2:     2        -0.016190263
 3:     3         0.002131860
 4:     4         0.004398704
 5:     5         0.017395620
 6:     6         0.002415809
 7:     7         0.004884450
 8:     8         0.001105352
 9:     9        -0.040150452
10:    10        -0.010925691
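
If you want a value other than 0 as the reference point, the same grouped idea can be sketched with which.min. This is only a minimal sketch on a small hand-made table: toy and target are hypothetical names that do not appear in the question. Note that which.min returns the first minimum, so a tie yields a single row per group, whereas the == min(...) comparison above would return every tied value.

```r
library(data.table)

# Small hand-made table (hypothetical data) so the result is easy to check by eye
toy <- data.table(
  group = c(1L, 1L, 1L, 2L, 2L, 2L),
  val   = c(-0.5, 0.2, 1.3, -0.1, 0.4, 2.0)
)

# Hypothetical parameter: the value each group should get close to
target <- 0

# For each group, pick the val with the smallest absolute distance to target;
# which.min() returns the index of the first minimum, so ties give one row
res <- toy[, .(closest.val = val[which.min(abs(val - target))]), by = group]
print(res)
```

Changing target to 1, for example, picks 1.3 for group 1 and 0.4 for group 2 on this toy table.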

A more general approach, as posted by @chinsoon12 in the comments:

DT[CJ(group = group, val = 0, unique = TRUE)
   , on = .(group, val)
   , .(group, closest.val.to.zero = x.val)
   , roll = "nearest"]
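
One thing the rolling join allows, at least as I read it, is a different reference value per group, because the targets come from a lookup table rather than a constant. The following is a minimal sketch under that assumption: toy and targets are hypothetical tables invented for illustration, not part of the question.

```r
library(data.table)

# Small hand-made table (hypothetical data) so the result is easy to check by eye
toy <- data.table(
  group = c(1L, 1L, 1L, 2L, 2L, 2L),
  val   = c(-0.5, 0.2, 1.3, -0.1, 0.4, 2.0)
)

# Hypothetical lookup table: one target value per group
targets <- data.table(group = c(1L, 2L), val = c(0, 1))

# Join on group exactly and roll val to the nearest match within each group;
# i.val is the requested target, x.val is the value actually found in toy
res <- toy[targets
           , on = .(group, val)
           , .(group, target = i.val, closest.val = x.val)
           , roll = "nearest"]
print(res)
```

On this toy table, group 1 (target 0) matches 0.2 and group 2 (target 1) matches 0.4.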