How to select a range of rows in R

Date: 2015-12-24 02:23:29

Tags: r

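I have a data frame called mydf. I also have a vector, myvec <- c("chr5:11", "chr3:112", "chr22:334"). If any element of the vector matches a key in mydf, I want to select a range of rows around it (the keys up to 3 values above and 3 values below the match) and create a subset of mydf (result).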

Because chr5:11 in mydf matches an element of myvec, we select the rows of mydf whose keys run from chr5:8 (three values below) to chr5:14 (three values above).

mydf

 mydf<- structure(list(key = structure(c(5L, 2L, 7L, 8L, 4L, 1L, 6L, 
3L, 11L, 10L, 9L), .Names = c("34", "35", "36", "37", "38", "39", 
"40", "41", "42", "43", "44"), .Label = c("chr5:10", "chr5:11", 
"chr5:1123", "chr5:118", "chr5:12", "chr5:123", "chr5:13", "chr5:14", 
"chr5:19", "chr5:8", "chr5:9"), class = "factor"), variantId = structure(1:11, .Names = c("34", 
"35", "36", "37", "38", "39", "40", "41", "42", "43", "44"), .Label = c("9920068", 
"9920069", "9920070", "9920071", "9920072", "9920073", "9920074", 
"9920075", "9920076", "9920077", "9920078"), class = "factor")), .Names = c("key", 
"variantId"), row.names = c("34", "35", "36", "37", "38", "39", 
"40", "41", "42", "43", "44"), class = "data.frame")

2 Answers:

Answer 0 (score: 3)

Something like the following (I used data.table, but the base version is almost identical):

library(data.table)
mydf <- as.data.table(mydf) # (if mydf really is stored as a matrix currently)
myvec2 <- lapply(strsplit(gsub("chr", "", myvec), split=":"), as.integer)

mydf[unique(Reduce(c, sapply(myvec2, function(x){
  which(key %in% paste0("chr", x[1], ":", seq((x2 <- x[2]) - 3L, x2 + 3L)))}
))), ]

(In base, replace as.data.table with as.data.frame and key with mydf$key, and write the closing bracket as ,] rather than ].)
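As an illustration of the base R variant, here is a minimal sketch that needs no extra packages. It rebuilds mydf as a plain data frame with character columns (an assumption made so the example is self-contained), and `window` and `wanted` are hypothetical names introduced here, not part of the original answer.

```r
# Rebuild the example data as a plain data frame (assumption: character columns).
mydf <- data.frame(
  key = c("chr5:12", "chr5:11", "chr5:13", "chr5:14", "chr5:118", "chr5:10",
          "chr5:123", "chr5:1123", "chr5:9", "chr5:8", "chr5:19"),
  variantId = as.character(9920068:9920078),
  row.names = as.character(34:44),
  stringsAsFactors = FALSE
)
myvec <- c("chr5:11", "chr3:112", "chr22:334")
window <- 3L  # hypothetical name for the +/- 3 offset

# Split each "chrN:pos" string into its chromosome label and position.
parts <- strsplit(myvec, ":", fixed = TRUE)

# For every element of myvec, build the keys from pos - window to pos + window.
wanted <- unlist(lapply(parts, function(p) {
  pos <- as.integer(p[2])
  paste0(p[1], ":", seq(pos - window, pos + window))
}))

# Keep the rows whose key is one of the wanted keys (original row order is kept).
result <- mydf[as.character(mydf$key) %in% wanted, ]
print(result)
```

Matching on the full key string means that chr5:118 or chr5:1123 cannot be picked up by accident when the target is chr5:11.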

Extra option with sorting

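Actually, I think this option is generally better, because it stores your information in a more malleable form to begin with. This version is a bit heavier on data.table idioms.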
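To make the idea of a more malleable representation concrete, here is a rough sketch that splits the key into a chromosome column and an integer position column and then sorts. It is written in base R rather than data.table so that it runs without extra packages, and the names `tidy`, `chr`, `pos` and `hit` are assumptions for illustration only.

```r
keys <- c("chr5:12", "chr5:11", "chr5:13", "chr5:14", "chr5:118", "chr5:10",
          "chr5:123", "chr5:1123", "chr5:9", "chr5:8", "chr5:19")

# Split "chrN:pos" into two columns: chromosome label and integer position.
parts <- do.call(rbind, strsplit(keys, ":", fixed = TRUE))
tidy <- data.frame(chr = parts[, 1],
                   pos = as.integer(parts[, 2]),
                   stringsAsFactors = FALSE)

# Sort by chromosome, then numerically by position.
tidy <- tidy[order(tidy$chr, tidy$pos), ]

# With numeric positions, a +/- 3 window around chr5:11 is a simple comparison.
hit <- tidy$chr == "chr5" & abs(tidy$pos - 11L) <= 3L
print(tidy[hit, ])
```

Once the position is numeric, the window becomes an arithmetic test instead of a set of pasted strings, and sorting orders the rows by position rather than alphabetically.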


Now, with a slight tweak, proceed as before:


Output:


Answer 1 (score: 2)

You can use the GenomicRanges package.

library(GenomicRanges)

myvec <- c("chr5:11", "chr3:112", "chr22:334")
myvec.gr <- GRanges(gsub(":.+", "", myvec), 
                    IRanges(as.numeric(gsub(".+:", "", myvec)) - 3,
                            as.numeric(gsub(".+:", "", myvec)) + 3))

mydf.gr <- GRanges(gsub(":.+", "", mydf[,"key"]), 
                   IRanges(as.numeric(gsub(".+:", "", mydf[,"key"])),
                           as.numeric(gsub(".+:", "", mydf[,"key"]))))

d.v.op <- findOverlaps(mydf.gr, myvec.gr)

mydf[queryHits(d.v.op), ]
#    key       variantId
# 34 "chr5:12" "9920068"
# 35 "chr5:11" "9920069"
# 36 "chr5:13" "9920070"
# 37 "chr5:14" "9920071"
# 39 "chr5:10" "9920073"
# 42 "chr5:9"  "9920076"
# 43 "chr5:8"  "9920077"