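I am trying to write a function to solve a small problem. I have two lists, each containing n samples. Each sample has a variable number of bacterial identifiers (letters in the example; in my real problem they are identifiers such as OTU1-OTUn, of type "character" in both cases). One list holds samples from the diet, and the other holds samples from the gut content. For each gut sample, I want to know how many of the gut bacteria come from the diet and how many of the gut bacteria do not come from the diet. This is easy to do with phyloseq objects, when both diet and gut are phyloseq objects: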
Bacteria_from_diet <- length(intersect(taxa_names(gut), taxa_names(diet)))
Bacteria_not_diet <- length(taxa_names(gut)) - Bacteria_from_diet
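However, this "summarizes" the result over all n gut and diet samples, as if the data had been collapsed across samples, and I need the per-sample variation. I have tried the following code in R: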
diet<-list(DL1=c("A","B","C"),DL2=c("A","C","D"),DL3=c("B","D","E"),DL4=c("B","D","E"))
gut<-list(DL5=c("A","F","G"),DL6=c("B","F","H"),DL7=c("D","H","J"),DL8=c("A","G","F"))
gut_vs_diet <- function(a,b) ## a is diet and b is gut
{
  xx<-10
  gut = numeric(xx)
  diet = numeric(xx)
  all<-unlist(lapply(b,length)) ### get the number of elements of each element of list b
  for(i in seq_along(b)){ #### loop over b (gut) to get:
    diet<-length(intersect(b[[i]],a[[i]])) ### the number of elements of diet are present in gut
    gut = all-diet ## the number of elements of gut that not come from diet
  }
  gutvsdiet = data.frame(all,gut,diet)
  return(gutvsdiet)
}
When I run the function, the result I get is not correct:
gut_vs_diet(diet,gut)
    all gut diet
DL5   3   3    0
DL6   3   3    0
DL7   3   3    0
DL8   3   3    0
In some cases I do get some values in the diet column, but the function seems to pick the diet samples at random.
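I don't know where the mistake might be. In any case, I would like to do this iteratively, that is, compare each gut sample against all of the diet samples. Alternatively, I could run
replicate(10, gut_vs_diet(sample(diet), sample(gut)))
to make random comparisons and avoid some kind of bias.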
Thank you very much for your help.
Manuel
Answer 0 (score: 2)
Here is my version of the code:
diet <- list(DL1=c("A","B","C"), DL2=c("A","C","D"), DL3=c("B","D","E"), DL4=c("B","D","E"))
gut <- list(DL5=c("A","F","G"), DL6=c("B","F","H"), DL7=c("D","H","J"), DL8=c("A","G","F"))
gut_vs_diet <- function(a, b) ## a is diet and b is gut
{
  all <- lengths(b) ### get the number of elements of each element of list b
  diet <- mapply(function(ai, bi) length(intersect(ai, bi)), a, b)
  # diet <- lengths(mapply(intersect, a, b, SIMPLIFY = FALSE)) ## a variant; SIMPLIFY = FALSE keeps a list even when all intersections have the same length
  data.frame(all, gut=all-diet, diet)
}
gut_vs_diet(diet,gut)
# > gut_vs_diet(diet,gut)
#     all gut diet
# DL5   3   2    1
# DL6   3   3    0
# DL7   3   2    1
# DL8   3   3    0
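The question also asks about comparing each gut sample against all of the diet samples, not only the one at the same position. A minimal sketch of one way to do that, assuming the same example lists as above: `shared` and `from_any_diet` are names made up for this illustration, and pooling the diet samples with `unique(unlist(diet))` is just one possible reading of "all diet samples".

```r
diet <- list(DL1 = c("A", "B", "C"), DL2 = c("A", "C", "D"),
             DL3 = c("B", "D", "E"), DL4 = c("B", "D", "E"))
gut  <- list(DL5 = c("A", "F", "G"), DL6 = c("B", "F", "H"),
             DL7 = c("D", "H", "J"), DL8 = c("A", "G", "F"))

# All pairwise comparisons: rows are gut samples, columns are diet samples,
# each cell is the number of identifiers the two samples share.
shared <- sapply(diet, function(d) {
  sapply(gut, function(g) length(intersect(g, d)))
})

# Assumption: "comes from the diet" means "present in at least one diet sample".
pooled_diet   <- unique(unlist(diet))
from_any_diet <- vapply(gut, function(g) sum(g %in% pooled_diet), integer(1))

shared
from_any_diet
```

The diagonal of `shared` corresponds to the position-by-position pairing used by `gut_vs_diet()`; the off-diagonal cells are the extra comparisons.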
Answer 1 (score: 1)
As @jogo suggested in the comments, you can use mapply instead of a for loop:
FOO <- function(x, y){
  all <- lengths(y)
  diet <- mapply(function(a, b){
    length(intersect(b, a))
  }, x, y)
  gut <- all - diet
  return(data.frame(all, gut, diet))
}
> FOO(diet, gut)
    all gut diet
DL5   3   2    1
DL6   3   3    0
DL7   3   2    1
DL8   3   3    0
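For the random comparisons mentioned in the question, here is a rough sketch built on the same mapply idea. `pair_counts`, `runs` and `mean_shared` are hypothetical names, and the seed is arbitrary. Only the diet list is shuffled, on the assumption that keeping the gut samples in a fixed order makes the rows comparable across runs.

```r
diet <- list(DL1 = c("A", "B", "C"), DL2 = c("A", "C", "D"),
             DL3 = c("B", "D", "E"), DL4 = c("B", "D", "E"))
gut  <- list(DL5 = c("A", "F", "G"), DL6 = c("B", "F", "H"),
             DL7 = c("D", "H", "J"), DL8 = c("A", "G", "F"))

# Same logic as the mapply-based answers; the name is made up for this sketch.
pair_counts <- function(a, b) {
  all  <- lengths(b)
  diet <- mapply(function(ai, bi) length(intersect(ai, bi)), a, b)
  data.frame(all, gut = all - diet, diet)
}

set.seed(1)  # arbitrary seed, only so the shuffles can be reproduced
# simplify = FALSE keeps one data frame per run instead of collapsing them.
runs <- replicate(10, pair_counts(sample(diet), gut), simplify = FALSE)

# Average number of shared identifiers per gut sample across the random pairings.
mean_shared <- rowMeans(sapply(runs, function(r) r$diet))
names(mean_shared) <- names(gut)
mean_shared
```

Each element of `runs` is one random pairing, so other summaries (range, standard deviation) can be computed the same way.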
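Answer 2 (score: 0)
Just for completeness, this is what it looks like with a for loop. Note that you need to subtract all[[i]] - diet and build the data frame inside the loop; otherwise you only fill it with the result of the last iteration, i.e. data.frame(all = c(3,3,3,3), gut = 3, diet = 0).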
diet <- list(DL1 = c("A", "B", "C"), DL2 = c("A", "C", "D"), DL3 = c("B", "D", "E"), DL4 = c("B", "D", "E"))
gut <- list(DL5 = c("A", "F", "G"), DL6 = c("B", "F", "H"), DL7 = c("D", "H", "J"), DL8 = c("A", "G", "F"))
gut_vs_diet <- function(a, b)
{
  all <- lengths(b)
  gutvsdiet <- NULL
  for (i in seq_along(b)) {
    diet <- length(intersect(b[[i]], a[[i]]))
    gut <- all[[i]] - diet
    resultForThisListElement <- c(all[[i]], gut, diet)
    gutvsdiet <- rbind(gutvsdiet, resultForThisListElement) # append one row per gut sample
  }
  colnames(gutvsdiet) <- c("all", "gut", "diet")
  rownames(gutvsdiet) <- names(b) # label the rows with the gut sample names
  return(gutvsdiet) # note: this is a matrix, not a data frame
}
gut_vs_diet(diet, gut)
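A possible alternative to growing the result with rbind is to preallocate a vector and fill it by index, which also shows where the original function went wrong: it overwrote the whole `diet` variable on every pass instead of assigning to `diet[i]`. This is only a sketch; `gut_vs_diet_loop` is a made-up name and the length check is an added assumption that both lists are paired position by position.

```r
diet <- list(DL1 = c("A", "B", "C"), DL2 = c("A", "C", "D"),
             DL3 = c("B", "D", "E"), DL4 = c("B", "D", "E"))
gut  <- list(DL5 = c("A", "F", "G"), DL6 = c("B", "F", "H"),
             DL7 = c("D", "H", "J"), DL8 = c("A", "G", "F"))

gut_vs_diet_loop <- function(a, b) {  # a is diet, b is gut
  stopifnot(length(a) == length(b))   # assumption: samples are paired by position
  all  <- lengths(b)                  # number of identifiers in each gut sample
  diet <- integer(length(b))          # preallocate one slot per gut sample
  for (i in seq_along(b)) {
    # Assign into slot i; overwriting `diet` itself would keep only the last value.
    diet[i] <- length(intersect(b[[i]], a[[i]]))
  }
  data.frame(all, gut = all - diet, diet)
}

res <- gut_vs_diet_loop(diet, gut)
res
```

With the example lists this should give the same counts as the mapply versions, with the gut sample names as row names.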