Find overlapping ranges based on position in R

Time: 2018-06-15 15:14:25

Tags: r

I have two datasets. The first:

 chr1 25 85
 chr1 2000 3000
 chr2 345 2300

and the second:

chr1 34 45 1.2
chr1 100 1000
chr2 456 1500 1.3

This is the output I want:

chr1 25 85 1.2
chr2 345 2300 1.3

Here is my code:

sb <- NULL
rangesC <- NULL
sb$bin <- NULL
for(i in levels(df1$V1)){
   s <- subset(df1, df1$V1 == i)
   sb <- subset(df2, df2$V1 == i)
   for(j in 1:nrow(sb)){
     sb$bin[j] <-s$V4[(s$V2 <= sb$V2[j] & s$V3 >= sb$V3[j])]
  }
 rangesC <- try(rbind(rangesC, sb),silent = TRUE)
}

The error I get is:

"replacement has length zero". Alternatively, when I use as.character, rangesC ends up empty.

Where the positions overlap, I want the corresponding V4 value. What is going wrong?

1 answer:

Answer 0 (score: 1)

The foverlaps() function from the data.table package performs an overlap join of two data.tables:

library(data.table)
setDT(df1, key = names(df1))
setDT(df2, key = key(df1))
foverlaps(df2, df1, nomatch = 0L)[, -c("i.V2", "i.V3")]
     V1  V2   V3  V4
1: chr1  25   85 1.2
2: chr2 345 2300 1.3
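
To make the overlap rule explicit, here is a minimal base R sketch of the same kind of join, not a replacement for foverlaps(). The data frames ranges and scores are hypothetical stand-ins for the two datasets in the question, and the column names V1 to V4 are assumed. It pairs every interval with every other interval on the same chromosome, so it only suits small inputs.

```r
# Hypothetical stand-ins for the question's data (column names V1..V4 assumed)
ranges <- data.frame(V1 = c("chr1", "chr1", "chr2"),
                     V2 = c(25, 2000, 345),
                     V3 = c(85, 3000, 2300))
scores <- data.frame(V1 = c("chr1", "chr1", "chr2"),
                     V2 = c(34, 100, 456),
                     V3 = c(45, 1000, 1500),
                     V4 = c(1.2, NA, 1.3))

# Pair every range with every scored interval on the same chromosome;
# the start/end columns coming from scores get the suffix ".i"
pairs <- merge(ranges, scores, by = "V1", suffixes = c("", ".i"))

# Two closed intervals overlap when each one starts no later than the other ends
hit <- pairs$V2 <= pairs$V3.i & pairs$V2.i <= pairs$V3

# Keep the coordinates from ranges plus the matching V4 value
overlaps <- pairs[hit, c("V1", "V2", "V3", "V4")]
rownames(overlaps) <- NULL
print(overlaps)
```

The condition in hit treats any shared position as an overlap, which corresponds to the default type = "any" in foverlaps().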

Data

library(data.table)
df1 <- fread(
  "chr1 25 85
 chr1 2000 3000
 chr2 345 2300", header = FALSE
)

df2 <- fread(
  "chr1 34 45 1.2
chr1 100 1000 
chr2 456 1500 1.3", header = FALSE
)
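
As for the error in the question: "replacement has length zero" is what R reports when a zero-length value is assigned to a single element. That is presumably what happens in the loop whenever the logical filter matches no row. It would also happen if s is taken from the three-column table, because s$V4 is then NULL. The sketch below reproduces the message in isolation; first_or_na is a hypothetical helper, not part of any package, shown as one way to guard such an assignment.

```r
bin <- numeric(3)

# Subsetting with an all-FALSE filter yields a zero-length vector,
# like a range that overlaps nothing
no_match <- c(1.2, 1.3)[c(FALSE, FALSE)]

# Assigning a zero-length value to one element raises the error from the question
msg <- tryCatch({
  bin[1] <- no_match
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Hypothetical helper: return the first match, or NA when there is none
first_or_na <- function(x) if (length(x) == 0) NA_real_ else x[1]

bin[1] <- first_or_na(no_match)       # stores NA instead of failing
bin[2] <- first_or_na(c(1.2, 1.3))    # stores the first matching value
print(bin)
```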