Matching based on a range

Date: 2015-07-27 20:48:02

Tags: r

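I want to match the elements of two columns of unequal length, taken from two different data frames, when they differ by an amount in the following range: 1 to 3 (2 +/- 1).

My data frames: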

DAT1:

Number  status
10023   T
10324   F
12277   F
12888   T
12000   T

DAT2:

Number  status
10020   T
10002   F
12279   F
12888   T

Required output:

10023   10020   T
12277   12279   F

My attempt (below) does not work:

diff <- 2
allow <- 1

NewData <- dat1$Number %in% (dat2$Number<=diff+allow | dat2$Number>=diff+allow)

Any help would be appreciated.

1 Answer:

Answer 0: (score: 6)

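This looks like a must-use case for data.table::foverlaps.

The workflow is to create start and end columns in both data sets. In the first, both are simply Number; in the second, they span the allowed range around Number. Then we key both tables (on status, start, end) and simply run foverlaps.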

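library(data.table)
diff <- 2
allow <- 1
# dat1: a zero-width interval at each Number
setDT(dat1)[, `:=`(start = Number, end = Number)]
setkey(dat1, status, start, end)
# dat2: the interval Number +/- (diff + allow)
setDT(dat2)[, `:=`(start = Number - (diff + allow), end = Number + diff + allow)]
setkey(dat2, status, start, end)
# overlap join within each status; i.Number is the Number column from dat2
foverlaps(dat2, dat1, nomatch = 0L)[, .(Numdf1 = Number, Numdf2 = i.Number, status)]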

#    Numdf1 Numdf2 status
# 1:  12277  12279  FALSE
# 2:  10023  10020   TRUE
# 3:  12888  12888   TRUE ### <- I'm assuming you had an error in the desired output
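
If the 12888/12888 pair really should be excluded, that is, if "1 to 3" is read literally as a lower bound of diff - allow as well as an upper bound of diff + allow, a base R sketch along the following lines might do. It is not part of the answer above: it assumes status is a logical column, rebuilds the question's data by hand, and uses a plain merge on status followed by a filter instead of foverlaps. The names pairs, gap and res are made up for illustration.

```r
# Rebuild the question's data; status is assumed to be logical (T/F)
dat1 <- data.frame(Number = c(10023, 10324, 12277, 12888, 12000),
                   status = c(TRUE, FALSE, FALSE, TRUE, TRUE))
dat2 <- data.frame(Number = c(10020, 10002, 12279, 12888),
                   status = c(TRUE, FALSE, FALSE, TRUE))

diff  <- 2
allow <- 1

# All combinations of dat1 and dat2 rows that share the same status
pairs <- merge(dat1, dat2, by = "status", suffixes = c(".df1", ".df2"))

# Keep pairs whose absolute difference lies in [diff - allow, diff + allow]
gap <- abs(pairs$Number.df1 - pairs$Number.df2)
res <- pairs[gap >= diff - allow & gap <= diff + allow,
             c("Number.df1", "Number.df2", "status")]

# Tidy up the row order and row names for display
res <- res[order(res$Number.df1), ]
rownames(res) <- NULL
print(res)
```

On the sample data this keeps 10023/10020 and 12277/12279 and drops 12888/12888, because a difference of 0 falls below the lower bound. The merge builds every same-status pair before filtering, so for large tables the foverlaps approach is likely to scale better.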