How do I merge data.frames with overlapping intervals?
df1 <- read.table(textConnection(
"  from  to Lith Form
1   0.0 1.2  GRN  BCM
2   1.2 5.0  GDI  BDI
"), header=TRUE)
df2 <- read.table(textConnection(
"  from  to Weath Str
1   0.0 1.1    HW  ES
2   1.1 2.9    SW  VS
3   2.9 5.0    HW  ST
"), header=TRUE)
  from  to Weath Str Lith Form
1  0.0 1.1    HW  ES  GRN  BCM
2  1.1 1.2    SW  VS  GRN  BCM
3  1.2 2.9    SW  VS  GDI  BDI
4  2.9 5.0    HW  ST  GDI  BDI
Answer 0 (score: 8)
Here is one way to do it. It is similar to eddi's answer (R cutting two data.frames based on intervals and merging), but it lets you carry as many columns as you need in each data.frame.
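library(data.table)

# convert the data.frames to data.tables keyed on 'from'
dt1 <- data.table(df1, key = 'from')
dt2 <- data.table(df2, key = 'from')
# skeleton for the joined data.table: one row per combined interval
dt <- data.table(from = sort(unique(c(dt1[, from], dt2[, from]))),
                 to = sort(unique(c(dt1[, to], dt2[, to]))),
                 key = 'from')
# function to join the skeleton with a data.table via a rolling join
j1 <- function(dt, dt1) {
  dt3 <- dt1[dt, roll = TRUE]
  # the skeleton's 'to' comes back as i.to in current data.table releases
  # (older releases named it to.1); use it as the interval end
  dt3[, to := i.to]
  dt3[, i.to := NULL]
  setkey(dt3, from, to)
  return(dt3)
}
# merge the two data.tables
j1(dt, dt2)[j1(dt, dt1)]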
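For readers who want to see what the skeleton plus rolling join amounts to, here is a minimal base R sketch of the same idea. It is not the answer's code: it swaps the data.table rolling join for `findInterval()`, and it assumes both tables are sorted by `from`, have no gaps between consecutive intervals, and cover the same overall range. The names `breaks`, `skeleton`, `i1`, `i2` and `merged` are made up for illustration.

```r
# Example inputs with the same values as in the question
df1 <- data.frame(from = c(0, 1.2), to = c(1.2, 5.0),
                  Lith = c("GRN", "GDI"), Form = c("BCM", "BDI"))
df2 <- data.frame(from = c(0, 1.1, 2.9), to = c(1.1, 2.9, 5.0),
                  Weath = c("HW", "SW", "HW"), Str = c("ES", "VS", "ST"))

# Every distinct breakpoint from either table bounds a combined interval
breaks <- sort(unique(c(df1$from, df1$to, df2$from, df2$to)))
skeleton <- data.frame(from = head(breaks, -1), to = tail(breaks, -1))

# For each skeleton interval, pick the source row whose 'from' is the
# largest one not exceeding the skeleton 'from' (the rolling lookup)
i1 <- findInterval(skeleton$from, df1$from)
i2 <- findInterval(skeleton$from, df2$from)

# Attach the attribute columns of both tables to the skeleton
merged <- cbind(skeleton,
                df2[i2, c("Weath", "Str")],
                df1[i1, c("Lith", "Form")])
rownames(merged) <- NULL
merged
```

If either table has gaps or the ranges differ, the lookup would silently carry the previous row's attributes forward, so this sketch should only be used when the assumptions above hold.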
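Overlap joins (also called interval joins) were recently implemented in data.table v1.9.3. With them, I think your task can be done as follows (assuming your data.frames are df1 and df2):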
require(data.table) ## 1.9.3+
setDT(df1) ## convert to data.table without copy
setDT(df2)
setkey(df2, from, to)
ans = foverlaps(df1, df2, type="any")
ans = ans[, `:=`(from = pmax(from, i.from), to = pmin(to, i.to))]
ans = ans[, `:=`(i.from=NULL, i.to=NULL)][from <= to]
# from to Weath Str Lith Form
# 1: 0.0 1.1 HW ES GRN BCM
# 2: 1.1 1.2 SW VS GRN BCM
# 3: 1.2 2.9 SW VS GDI BDI
# 4: 2.9 5.0 HW ST GDI BDI
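The logic behind the overlap join can also be written out by hand in base R, which may help when checking the result. The sketch below is an illustration under stated assumptions, not how `foverlaps` is implemented: it pairs every row of one table with every row of the other, clips each pair to its intersection with `pmax`/`pmin`, and keeps only intersections of positive length. The names `idx`, `keep` and `res` are hypothetical, and the all-pairs step only suits small tables.

```r
# Example inputs with the same values as in the question
df1 <- data.frame(from = c(0, 1.2), to = c(1.2, 5.0),
                  Lith = c("GRN", "GDI"), Form = c("BCM", "BDI"))
df2 <- data.frame(from = c(0, 1.1, 2.9), to = c(1.1, 2.9, 5.0),
                  Weath = c("HW", "SW", "HW"), Str = c("ES", "VS", "ST"))

# All (df2 row, df1 row) index pairs
idx <- expand.grid(i2 = seq_len(nrow(df2)), i1 = seq_len(nrow(df1)))

# Intersection of each pair: latest start and earliest end
from <- pmax(df2$from[idx$i2], df1$from[idx$i1])
to   <- pmin(df2$to[idx$i2],   df1$to[idx$i1])

# Keep pairs that really overlap; the strict '<' drops pairs that only touch
keep <- from < to

res <- data.frame(from = from[keep], to = to[keep],
                  df2[idx$i2[keep], c("Weath", "Str")],
                  df1[idx$i1[keep], c("Lith", "Form")])
res <- res[order(res$from), ]
rownames(res) <- NULL
res
```

The strict comparison is a deliberate choice here: with `from <= to`, two intervals that merely share an endpoint would produce a zero-length row.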