R: counting occurrences in a list of numeric vectors

Time: 2013-11-26 12:05:10

Tags: r list

I have this list of genes and their matching genomic positions:

$HES2
 [1] 6472685 6472960 6473183 6473556 6473721 6473880 6474011 6474016 6474021
[10] 6474026 6474031 6474036 6474595 6475487 6475525 6475625 6475843 6475847
[19] 6475851 6476040 6476416 6477257 6477498 6477543 6478033 6478181 6480077
[28] 6481025 6481370 6481426 6483450 6483762 6483892 6484023 6484664

$HGFAC
[1] 3444133 3444265 3445567 3448970 3449333 3450621

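I want to count the occurrences within fixed-width bins (segments), say 100 base pairs wide, and report the gene and the count for each segment. Here is some dummy data showing the output I have in mind: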

HES2  6474026-6474126  150
HES2  6475038-6475138  589

Thanks in advance.

1 answer:

Answer 0: (score: 3)

Not sure if I'm missing something, but is this what you're after?

HES2 <- c(6472685, 6472960, 6473183, 6473556, 6473721, 6473880, 6474011, 6474016, 6474021,
          6474026, 6474031, 6474036, 6474595, 6475487, 6475525, 6475625, 6475843, 6475847,
          6475851, 6476040, 6476416, 6477257, 6477498, 6477543, 6478033, 6478181, 6480077,
          6481025, 6481370, 6481426, 6483450, 6483762, 6483892, 6484023, 6484664)

bin_w <- 100  # bin width in base pairs
# A single number passed to `breaks` is only a suggested number of bins,
# so build explicit break points that are exactly bin_w apart.
brks <- seq(floor(min(HES2) / bin_w) * bin_w, ceiling(max(HES2) / bin_w) * bin_w, by = bin_w)
h <- hist(HES2, breaks = brks, plot = FALSE)
# Each bin runs from one break to the next; h$counts has one entry per bin.
counts <- data.frame(from = head(h$breaks, -1), to = tail(h$breaks, -1), count = h$counts)
counts[counts$count > 0, ]

      from      to count
1  6472600 6472700     1
4  6472900 6473000     1
6  6473100 6473200     1
10 6473500 6473600     1
12 6473700 6473800     1
13 6473800 6473900     1
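
The question asks for the gene name alongside each bin, and the code above handles only HES2. Below is a minimal sketch of one way to extend the idea to the whole list. It assumes the positions are stored in a named list of numeric vectors; `genes`, `count_bins`, and `result` are names made up for this illustration and do not come from the original post. It also swaps `hist` for `floor` plus `rle`, so bins are closed on the left, `[from, to)`, whereas `hist` by default closes them on the right. For the data in the question this makes no difference, because no position falls exactly on a multiple of 100.

```r
# Positions copied from the question, as a named list of numeric vectors.
genes <- list(
  HES2 = c(6472685, 6472960, 6473183, 6473556, 6473721, 6473880, 6474011, 6474016, 6474021,
           6474026, 6474031, 6474036, 6474595, 6475487, 6475525, 6475625, 6475843, 6475847,
           6475851, 6476040, 6476416, 6477257, 6477498, 6477543, 6478033, 6478181, 6480077,
           6481025, 6481370, 6481426, 6483450, 6483762, 6483892, 6484023, 6484664),
  HGFAC = c(3444133, 3444265, 3445567, 3448970, 3449333, 3450621)
)

bin_w <- 100  # bin width in base pairs

# Hypothetical helper: count the positions of one gene per occupied bin.
count_bins <- function(pos, gene, width) {
  start <- floor(pos / width) * width  # left edge of the bin holding each position
  runs <- rle(sort(start))             # run lengths of the sorted edges = counts per bin
  data.frame(gene = gene,
             from = runs$values,
             to = runs$values + width,
             count = runs$lengths,
             stringsAsFactors = FALSE)
}

# Apply the helper to every gene and stack the per-gene tables.
result <- do.call(rbind, Map(count_bins, genes, names(genes), bin_w))
rownames(result) <- NULL
print(result)
```

Only occupied bins appear in `result`, so there is no need to filter out zero counts afterwards. For HES2, the bin from 6474000 to 6474100 comes out with a count of 6.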