k=seq(10100,249250621,10)
# a is a data.frame with 300000 rows and 5 columns, in this format:
chr1 100000851 + 2 100000925
chr1 100001273 + 3 100001347
..............................
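1. Now I want to compute the following:
for each a[i,5], find every k[j] that could have generated a[i,5], i.e. every k[j] for which a[i,5] lies in the interval (k[j]-75, k[j]+75).
Then merge the results into a new data.frame(), with a[i,6] = k[j].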
2. I have written two versions of the code, but I don't know where I went wrong:
1)
b=function(x){
x1=a[which(a[,5]-(x-75)>0&a[,5]-(x+75)<0,]
x2=cbind(x1,x)
}
c=apply(k,1,function(x)a(x))
2)
for(i in 1:length(k)){
if(length(N1<-which(a[,5]-(k-75)>0&a[,5]-(k+75)<0))>0){
for(j in N1){
x1=cbind(k,a[j,])
x2=rbind(x2,x1)
}
}
}
But both of them are wrong.
I would be very grateful to anyone who can offer advice!
Answer 0 (score: 0)
From the comments, this is what I understood you want:
#sample "a" and "k"
a = data.frame(col1 = paste("chr", rep(1:2, 2), sep = ""), col2 = sample(4),
               col3 = "+", col4 = 1:4, col5 = c(215, 502, 345, 1007))
a
k = c(200, 300, 400, 500, 800, 1000)
a[[6]] = sapply(a[[5]], function(x)
           paste(k[x >= (k - 75) & x <= (k + 75)], collapse = ", "))
a
#  col1 col2 col3 col4 col5       V6
#1 chr1    4    +    1  215      200
#2 chr2    2    +    2  502      500
#3 chr1    3    +    3  345 300, 400
#4 chr2    1    +    4 1007     1000
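With the sizes from the question (about 300000 rows in a and roughly 25 million values in k), comparing every a[i,5] against all of k may be slow. Here is a minimal sketch of an alternative, assuming k is sorted in increasing order (which holds for a vector built with seq()). It uses base R's findInterval() to locate the window boundaries by binary search. The helper name match_window, its width argument, and the data frame long are my own, purely for illustration, and the left.open argument needs R 3.3.0 or later.

```r
# Hypothetical helper (not part of the answer above): for each value in x,
# return the elements of the sorted vector k that lie in [x - width, x + width].
match_window <- function(x, k, width = 75) {
  stopifnot(!is.unsorted(k))                          # assumption: k is sorted
  hi <- findInterval(x + width, k)                    # number of k <= x + width
  lo <- findInterval(x - width, k, left.open = TRUE)  # number of k <  x - width
  # k[(lo + 1):hi] are the matches; return an empty vector when there are none
  mapply(function(l, h) if (h > l) k[(l + 1):h] else k[0],
         lo, hi, SIMPLIFY = FALSE)
}

# same toy data as above (only the columns needed here)
a <- data.frame(col4 = 1:4, col5 = c(215, 502, 345, 1007))
k <- c(200, 300, 400, 500, 800, 1000)

hits <- match_window(a$col5, k)

# wide form: one comma-separated string per row, like the V6 column above
a$V6 <- vapply(hits, paste, character(1), collapse = ", ")

# long form: one row per (row of a, matching k) pair
long <- data.frame(a[rep(seq_len(nrow(a)), lengths(hits)), c("col4", "col5")],
                   k = unlist(hits), row.names = NULL)
```

The long form is probably closer to what the two attempts in the question were building with cbind() and rbind(): a row of a with several matching k[j] simply appears several times, once per match.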