Question

I have a reference table in which each row holds an interval (col1, col2) and two other values (color: "red" or "blue"; direction: "+" or "-"), such as interv below:

interv1 <- cbind(seq(from = 3, to = 40, by = 4),seq(from = 5, to = 50, by = 5), c(rep("blue",5), rep("red", 5)), rep("+",10))
interv2 <- cbind(seq(from = 3, to = 40, by = 4),seq(from = 5, to = 50, by = 5), c(rep("blue",5), rep("red", 5)), rep("-",10))
interv  <- rbind(interv1, interv2)

     [,1] [,2] [,3]   [,4]
[1,] "3"  "5"  "blue" "+" 
[2,] "7"  "10" "blue" "+" 
[3,] "11" "15" "blue" "+" 
[4,] "15" "20" "blue" "+" 
[5,] "19" "25" "blue" "+" 
[6,] "23" "30" "red"  "+"

I also have a table of interest in which each row holds a specific position that falls inside the intervals of the first table, plus the color and direction variables.

to_match <- cbind(rep(seq(from = 4, to = 43, by = 4),2), rep(c(rep("blue", 5), rep("red", 5)), 2), c(rep("-", 10), rep("+", 10)))

     [,1] [,2]   [,3]
[1,] "4"  "blue" "-" 
[2,] "8"  "blue" "-" 
[3,] "12" "blue" "-" 
[4,] "16" "blue" "-" 
[5,] "20" "blue" "-" 
[6,] "24" "red"  "-"

What I would like to do is associate each to_match value with the correct interval when it has the same color and the same direction. The idea is to get something like this:

     [,1] [,2] [,3]   [,4] [5] 
[1,] "3"  "5"  "blue" "+"  "4"

or the opposite:

     [,1] [,2]   [,3] [4] [5]
[1,] "4"  "blue" "-"  "3" "6"

I started trying to use the data.table::between() function, but it quickly turned into a mess... In my real dataset, the interv columns do not have the same length as to_match (not sure whether that is relevant).

Answer 1

A non-equi join will help you here.

Create sample data

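library(data.table)

# The matrices are character, so the columns arrive as V1, V2, ... of type character
dt1 <- as.data.table( interv, stringsAsFactors = FALSE )
dt2 <- as.data.table( to_match, stringsAsFactors = FALSE )
# Convert the interval bounds and the positions to numeric before joining
dt1[, `:=`(V1 = as.numeric(V1), V2 = as.numeric(V2))]
dt2[, `:=`(V1 = as.numeric(V1))]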
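
The numeric conversion matters because cbind() on a mix of numbers and strings returns a character matrix, so the interval bounds and positions start out as text. A minimal sketch of that coercion, using a small made-up matrix m (not part of the question's data):

```r
# Hypothetical two-row matrix mixing numbers and strings
m <- cbind(c(4, 10), c("blue", "red"))

typeof(m)   # "character": the numbers were coerced to strings
m[, 1]      # "4" "10"

# Comparing the raw values compares text, not magnitudes,
# so convert before testing whether a value lies in a range
pos <- as.numeric(m[, 1])
pos[1] < pos[2]   # TRUE: 4 < 10 once the values are numeric
```
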

Code

For all matching intervals:

dt1[ dt2, .(x.V1, x.V2, x.V3, x.V4, i.V1), on = .(V1<=V1, V2>=V1, V3=V2, V4 = V3), allow.cartesian = TRUE][]

Output

#     x.V1 x.V2 x.V3 x.V4 i.V1
#  1:    3    5 blue    -    4
#  2:    7   10 blue    -    8
#  3:   11   15 blue    -   12
#  4:   15   20 blue    -   16
#  5:   15   20 blue    -   20
#  6:   19   25 blue    -   20
#  7:   23   30  red    -   24
#  8:   23   30  red    -   28
#  9:   27   35  red    -   28
# 10:   27   35  red    -   32
# 11:   31   40  red    -   32
# 12:   31   40  red    -   36
# 13:   35   45  red    -   36
# 14:   31   40  red    -   40
# 15:   35   45  red    -   40
# 16:   39   50  red    -   40
# 17:    3    5 blue    +    4
# 18:    7   10 blue    +    8
# 19:   11   15 blue    +   12
# 20:   15   20 blue    +   16
# 21:   15   20 blue    +   20
# 22:   19   25 blue    +   20
# 23:   23   30  red    +   24
# 24:   23   30  red    +   28
# 25:   27   35  red    +   28
# 26:   27   35  red    +   32
# 27:   31   40  red    +   32
# 28:   31   40  red    +   36
# 29:   35   45  red    +   36
# 30:   31   40  red    +   40
# 31:   35   45  red    +   40
# 32:   39   50  red    +   40
#     x.V1 x.V2 x.V3 x.V4 i.V1
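
The question also asks for the opposite layout, with each position first and the matching interval bounds appended. The sketch below is one way to get it with the same kind of non-equi join. It rebuilds the sample data with named columns (start, end, color, direction, pos); those names are chosen here for readability and are not from the original post.

```r
library(data.table)

# Same sample data as in the question, built directly with named, typed columns
intervals <- data.table(
  start     = rep(seq(from = 3, to = 40, by = 4), 2),
  end       = rep(seq(from = 5, to = 50, by = 5), 2),
  color     = rep(rep(c("blue", "red"), each = 5), 2),
  direction = rep(c("+", "-"), each = 10)
)
positions <- data.table(
  pos       = rep(seq(from = 4, to = 43, by = 4), 2),
  color     = rep(rep(c("blue", "red"), each = 5), 2),
  direction = rep(c("-", "+"), each = 10)
)

# Equi-join on color and direction, non-equi join on start <= pos <= end.
# The x. prefix refers to columns of intervals, the i. prefix to columns of positions.
res <- intervals[
  positions,
  .(pos = i.pos, color = i.color, direction = i.direction,
    start = x.start, end = x.end),
  on = .(color, direction, start <= pos, end >= pos),
  allow.cartesian = TRUE
]

nrow(res)  # 32 for the sample data, one row per (position, matching interval) pair
```

A position that falls in no interval would still appear once, with NA in start and end, because the default nomatch keeps unmatched rows of positions.
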

Find rows in an interval table that contain a value, with 2 additional conditions to match
