Compare rows and find intersecting intervals between them

Time: 2017-04-07 09:26:34

Tags: r dataframe

I have a data frame like the following.

pos1<-c(5,15,25,40,80,5,18,22,38,84,5,16,50,92,31,50,20,30,50,70,27,50,60,50,90,20,40)
pos2<-c(10,17,30,42,90,10,20,24,42,87,10,19,52,100,40,70,25,32,60,90,30,60,71,60,100,25,50)
chr<-c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2)
n<-c(25,65,78,56,35,78,58,98,14,25,65,85,98,74,20,36,48,98,52,69,21,47,53,10,12,37,82)
pop<-c("A","A","A","A","A","B","B","B","B","C","C","C","C","C","D","D","A","A","A","A","B","B","B","C","C","D","D")
data<-data.frame(pos1,pos2,chr,pop,n)

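pos1 and pos2 give the start and end of each interval for each chr and pop. My goal is to find which intervals intersect among pops A, B and C (not D), and which intervals are unique to a single pop.

So, for the unique intervals, I would get a result data.frame like this: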

pos1.u<-c(25,50,92,20,30,27,90)
pos2.u<-c(30,52,100,25,32,30,100)
chr.u<-c(1,1,1,2,2,2,2)
pop.u<-c("A","B","C","A","A","B","C")
n.u<-c(78,98,74,48,98,21,12)
data.u<-data.frame(pos1.u,pos2.u,chr.u,pop.u,n.u)

For the intervals that intersect among these 3 populations, the data frame would be:

pos1.c<-c(5,15,40,80,5,38,85,5,16,50,70,50,60,50)
pos2.c<-c(10,17,42,90,10,42,87,10,19,60,90,60,71,60)
chr.c<-c(1,1,1,1,1,1,1,1,1,2,2,2,2,2)
pop.c<-c("A","A","A","A","B","B","B","C","C","A","A","B","B","C")
n.c<-c(25,65,56,35,78,14,25,65,85,52,69,47,53,10)
data.c<-data.frame(pos1.c,pos2.c,chr.c,pop.c,n.c)

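I don't know how to write a script that gets this fully right. Can you help me?

1 answer:

Answer 0: (score: 1)

I think the following code does what you want, although it produces results that differ from yours, so please double check! I think the difference comes from how open and closed intervals are defined. The code below assumes that neither endpoint is included, but I suspect that may not be what you mean (otherwise (15,18) and (17,19) would not count as overlapping, since no integer value falls in both). You may therefore need to adjust the open/closed definitions below.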
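To make the open/closed choice concrete, here is a minimal base-R sketch. It is not the answer's own code. The helpers `overlaps` and `flag_shared` and the small `toy` data frame are hypothetical names introduced for illustration. The sketch also makes two assumptions: endpoints are compared as real numbers (so the open variant only excludes intervals that merely touch at an endpoint), and an interval counts as "intersecting" if it overlaps at least one interval from a different pop on the same chr.

```r
# Overlap test for intervals (s1, e1) and (s2, e2), vectorised over s2/e2.
# closed = TRUE treats a shared endpoint as an overlap; closed = FALSE does not.
overlaps <- function(s1, e1, s2, e2, closed = TRUE) {
  if (closed) s1 <= e2 & s2 <= e1 else s1 < e2 & s2 < e1
}

# Hypothetical helper: for each row, TRUE if it overlaps at least one row
# from a different pop on the same chr (an assumed reading of "intersecting").
flag_shared <- function(df, closed = TRUE) {
  vapply(seq_len(nrow(df)), function(i) {
    other <- df$chr == df$chr[i] & df$pop != df$pop[i]
    any(overlaps(df$pos1[i], df$pos2[i],
                 df$pos1[other], df$pos2[other], closed))
  }, logical(1))
}

# Small illustrative data set (a few chr 1 rows in the style of the question).
toy <- data.frame(
  pos1 = c(5, 15, 25, 5, 18, 16, 31),
  pos2 = c(10, 17, 30, 10, 20, 19, 40),
  chr  = 1,
  pop  = c("A", "A", "A", "B", "B", "C", "D"),
  stringsAsFactors = FALSE
)

abc <- toy[toy$pop != "D", ]           # drop pop D before comparing
abc$shared <- flag_shared(abc, closed = TRUE)

toy.u <- abc[!abc$shared, ]            # intervals unique to one pop
toy.c <- abc[abc$shared, ]             # intervals overlapping another pop
```

Switching `closed` to `FALSE` changes the result only for intervals that touch exactly at an endpoint, for example `overlaps(10, 20, 20, 30)` versus `overlaps(10, 20, 20, 30, closed = FALSE)`. If "intersecting" should instead mean overlapping all three pops, or if positions should be treated as integers, the predicate would need to be adapted.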

props