R: mapply over the objects of two lists and return a list of data frames

Asked: 2015-10-21 20:03:29

Tags: r list loops mapply

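I have two lists of GRanges, and I am trying to apply the countOverlaps function to every combination of the two lists and return a list of results. The data look like this: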

library(GenomicRanges)
gr1 <- GRanges(seqnames = c("chr1", "chr2"), ranges = IRanges(c(7,13), width = 3), strand = c("+", "-"))
gr2 <- GRanges(seqnames = c("chr1", "chr3"), ranges = IRanges(c(5,13), width = 3), strand = c("+", "-"))
grlA <- GRangesList("a" = gr1, "b" = gr2)

gr1 <- GRanges(seqnames = c("chr1", "chr2"), ranges = IRanges(c(1,13), width = 3), strand = c("+", "-"))
gr2 <- GRanges(seqnames = c("chr1", "chr3"), ranges = IRanges(c(3,13), width = 3), strand = c("+", "-"))
grlB <- GRangesList("c" = gr1, "d" = gr2)

I would like to get a list, for objects "a" and "b" in grlA, that contains the result of the function for each element of grlB:

(a list of $a, $b and data frames for c, d)

$c
  a b

$d
  a b

This gets all combinations of the two lists:

comb_apply <- function(f,..., MoreArgs=list()){
  exp <- unname(as.list(expand.grid(...,stringsAsFactors = FALSE)))
  do.call(mapply, c(list(FUN=f, SIMPLIFY=FALSE, MoreArgs=MoreArgs), exp))
 }

# This function is thanks to Michael Lawrence's help posted in the bioconductor package
t= comb_apply(function(i, j) countOverlaps(grlA[[i]], grlB[[j]]), seq_along(grlA), seq_along(grlB))
names(t)=apply(expand.grid(names(grlA), names(grlB)), 1, paste, collapse="_")
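
To make the ordering of the combinations concrete, here is a minimal base-R sketch. It uses plain character vectors as stand-ins for names(grlA) and names(grlB) (the names a_names, b_names, grid and pairs are illustrative and not part of the code above), and only shows how expand.grid enumerates the pairs that mapply then walks through:

```r
# Toy stand-ins for names(grlA) and names(grlB); no Bioconductor needed.
a_names <- c("a", "b")
b_names <- c("c", "d")

# expand.grid varies its first argument fastest.
grid <- expand.grid(A = a_names, B = b_names, stringsAsFactors = FALSE)

# mapply visits the grid row by row, one (A, B) pair per call.
pairs <- mapply(function(i, j) paste(i, j, sep = "_"),
                grid$A, grid$B, USE.NAMES = FALSE)
print(pairs)
```

The pairs come out as a_c, b_c, a_d, b_d, so all results for one grlB element sit next to each other in the result list.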

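But to get what I want (a list of data frames), I need grep to select the results that belong to each element of grlB and save them in a separate list, and this is slow...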

new <- list()
for (i in names(grlB)) {
  df <- as.data.frame(t[grep(i, names(t))])
  new[[length(new) + 1]] <- df
}

Is there another way to do this without grep? Thanks!

1 Answer:

Answer 0: (score: 0)

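This data shouldn't be in a list structure, because it has a predictable and consistent shape. I would put it into a data frame and reshape it into roughly the format you are looking for.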

library(dplyr)
library(tidyr)

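t %>%
  as.data.frame %>%                    # one column per pair: a_c, b_c, a_d, b_d
  mutate(ID = 1:n()) %>%               # row index within each count vector
  gather(variable, value, -ID) %>%     # long format: one row per (ID, pair)
  separate(variable, c("A", "B")) %>%  # split "a_c" into A = "a", B = "c"
  spread(ID, value) %>%                # one column per ID
  group_by(B) %>%                      # one group per grlB element
  do(result = my_function(.))          # my_function is a placeholder, not defined here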
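
If you would rather stay with a list of data frames, a base-R alternative is to group the results by the grlB name taken directly from the combination grid, so no grep on the pasted names is needed. This is only a sketch under assumptions: res is a made-up stand-in for the question's result list t (the counts are invented), and grid, a_names and b_names are illustrative names, not part of the original code.

```r
# Hypothetical stand-in for the result list `t` from the question:
# one integer vector of counts per (grlA, grlB) pair. The values are made up.
a_names <- c("a", "b")
b_names <- c("c", "d")
grid <- expand.grid(A = a_names, B = b_names, stringsAsFactors = FALSE)
res <- list(c(0L, 1L), c(1L, 0L), c(1L, 1L), c(0L, 0L))
names(res) <- paste(grid$A, grid$B, sep = "_")

# Split the positions of `res` by the grlB name stored in the grid,
# then build one data frame per group with one column per grlA element.
# Like the grep loop, this assumes all vectors in a group have equal length.
new <- lapply(split(seq_along(res), grid$B), function(idx) {
  df <- as.data.frame(res[idx])
  names(df) <- grid$A[idx]
  df
})
print(new)
```

The result is a list named after the grlB elements (c and d), and each data frame has columns a and b. Because the grouping comes from the grid rather than from pattern matching, it also cannot pick up a name that merely contains the search string.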