I have this list of genes with their matching genomic positions:
$HES2
[1] 6472685 6472960 6473183 6473556 6473721 6473880 6474011 6474016 6474021
[10] 6474026 6474031 6474036 6474595 6475487 6475525 6475625 6475843 6475847
[19] 6475851 6476040 6476416 6477257 6477498 6477543 6478033 6478181 6480077
[28] 6481025 6481370 6481426 6483450 6483762 6483892 6484023 6484664
$HGFAC
[1] 3444133 3444265 3445567 3448970 3449333 3450621
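I want to count the occurrences per bin (segment) of, say, 100 base pairs, and report the gene and the count for each segment. Here is some dummy data to show the output I have in mind: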
HES2 6474026-6474126 150
HES2 6475038-6475138 589
Thanks in advance.
Answer 0 (score: 3)
Not sure if I'm missing something, but is this what you're after?
HES2 <- c(6472685, 6472960, 6473183, 6473556, 6473721, 6473880, 6474011, 6474016, 6474021,
          6474026, 6474031, 6474036, 6474595, 6475487, 6475525, 6475625, 6475843, 6475847,
          6475851, 6476040, 6476416, 6477257, 6477498, 6477543, 6478033, 6478181, 6480077,
          6481025, 6481370, 6481426, 6483450, 6483762, 6483892, 6484023, 6484664)
bin_w <- 100  # bin width in base pairs
# A single number passed to `breaks` is only a suggested bin count, so build explicit 100 bp break points
brks <- seq(floor(min(HES2) / bin_w) * bin_w, (floor(max(HES2) / bin_w) + 1) * bin_w, by = bin_w)
h <- hist(HES2, breaks = brks, right = FALSE, plot = FALSE)  # bins are [from, to)
counts <- data.frame(from = head(h$breaks, -1), to = tail(h$breaks, -1), count = h$counts)
counts[counts$count > 0, ]  # keep only the occupied bins
      from      to count
1  6472600 6472700     1
4  6472900 6473000     1
6  6473100 6473200     1
10 6473500 6473600     1
12 6473700 6473800     1
13 6473800 6473900     1
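
The same idea can be extended to every gene in the list, which gives the gene / segment / count layout asked for in the question. The following is only a minimal sketch under a few assumptions: `positions` and `count_bins` are hypothetical names, the HES2 vector is shortened to its first nine values, and bins are aligned to multiples of 100 rather than starting at an arbitrary position as in the dummy data.

```r
# Positions per gene, copied from the question (HES2 shortened to its first nine values)
positions <- list(
  HES2  = c(6472685, 6472960, 6473183, 6473556, 6473721, 6473880, 6474011, 6474016, 6474021),
  HGFAC = c(3444133, 3444265, 3445567, 3448970, 3449333, 3450621)
)
bin_w <- 100  # bin width in base pairs

# Hypothetical helper: count the positions of one gene in fixed-width bins
count_bins <- function(gene, pos, width) {
  start <- floor(pos / width) * width   # lower edge of the bin holding each position
  tab <- table(start)                   # occurrences per occupied bin, sorted by bin start
  from <- as.numeric(names(tab))
  data.frame(gene = gene,
             # sprintf avoids scientific notation for round starts such as 3000000
             segment = sprintf("%.0f-%.0f", from, from + width),
             count = as.vector(tab),
             stringsAsFactors = FALSE)
}

# Apply the helper to every gene and stack the per-gene tables
result <- do.call(rbind, Map(count_bins, names(positions), positions, bin_w))
rownames(result) <- NULL
print(result)
```

Only occupied bins are reported, so no filtering step is needed afterwards. Using `table()` on the bin start avoids allocating a histogram over the whole range, which may matter when positions are sparse.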