I have a data matrix like the one below:
> taxmat = matrix(sample(letters, 70, replace = TRUE), nrow = 10, ncol = 7)
> rownames(taxmat) <- paste0("OTU", 1:nrow(taxmat))
> taxmat<-cbind(taxmat,c("Genus","Genus","Genus","Family","Family","Order","Genus","Species","Genus","Species"))
> colnames(taxmat) <- c("Domain", "Phylum", "Class", "Order", "Family", "Genus", "Species", "Lowest")
> taxmat
Domain Phylum Class Order Family Genus Species Lowest
OTU1 "h" "c" "q" "e" "q" "w" "v" "Genus"
OTU2 "f" "y" "q" "z" "p" "w" "v" "Genus"
OTU3 "w" "q" "i" "i" "z" "j" "f" "Genus"
OTU4 "c" "e" "f" "n" "z" "b" "d" "Family"
OTU5 "g" "w" "q" "k" "e" "x" "k" "Family"
OTU6 "x" "j" "l" "w" "z" "o" "q" "Order"
OTU7 "k" "s" "j" "y" "t" "a" "t" "Genus"
OTU8 "w" "u" "s" "w" "g" "y" "n" "Species"
OTU9 "t" "r" "t" "o" "i" "l" "z" "Genus"
OTU10 "x" "p" "j" "f" "k" "q" "w" "Species"
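The column "Lowest" tells me the lowest rank at which I am confident in the data for that row. For each row, I want to replace the values in the columns after the one indicated by "Lowest" with "unknown".
The expected output for this example is: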
Domain Phylum Class Order Family Genus Species Lowest
OTU1 "b" "b" "v" "v" "l" "n" "unknown" "Genus"
OTU2 "l" "m" "w" "b" "f" "y" "unknown" "Genus"
OTU3 "h" "w" "n" "y" "k" "f" "unknown" "Genus"
OTU4 "u" "m" "p" "n" "t" "unknown" "unknown" "Family"
OTU5 "o" "b" "q" "w" "a" "unknown" "unknown" "Family"
OTU6 "s" "j" "l" "d" "unknown" "unknown" "unknown" "Order"
OTU7 "v" "y" "t" "p" "s" "v" "unknown" "Genus"
OTU8 "b" "r" "k" "d" "q" "c" "q" "Species"
OTU9 "k" "h" "b" "w" "h" "x" "unknown" "Genus"
OTU10 "o" "p" "b" "n" "k" "d" "q" "Species"
I can get all of the indices to replace as a vector using
idx<-lapply(tax$Lowest, grep, colnames(tax))
idx <- as.numeric(unlist(idx))+1
But I'm not sure how to replace the values. Thanks for your help!
Answer 0 (score: 1)
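We can loop over the rows with apply, build a positional index by using match to look up the row's last element (the value in 'Lowest') among the column names, and then replace the values in the columns after that position with 'unknown'. Here 'm1' is the matrix from the question.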
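t(apply(m1, 1, function(x) {
  # position of the first column after the rank named in 'Lowest' (element 8)
  i1 <- match(x[8], names(x)[-8]) + 1
  # 'Lowest' is the last rank (Species): index 0 means nothing is replaced
  i1[i1 > 7] <- 0
  i1 <- if (i1 != 0) i1:7 else i1
  # overwrite the lower ranks, then put 'Lowest' back as the last element
  c(replace(x[-8], i1, "unknown"), x[8])
}))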
# Domain Phylum Class Order Family Genus Species Lowest
#OTU1 "b" "b" "v" "v" "l" "n" "unknown" "Genus"
#OTU2 "l" "m" "w" "b" "f" "y" "unknown" "Genus"
#OTU3 "h" "w" "n" "y" "k" "f" "unknown" "Genus"
#OTU4 "u" "m" "p" "n" "t" "unknown" "unknown" "Family"
#OTU5 "o" "b" "q" "w" "a" "unknown" "unknown" "Family"
#OTU6 "s" "j" "l" "d" "unknown" "unknown" "unknown" "Order"
#OTU7 "v" "y" "t" "p" "s" "v" "unknown" "Genus"
#OTU8 "b" "r" "k" "d" "q" "c" "q" "Species"
#OTU9 "k" "h" "b" "w" "h" "x" "unknown" "Genus"
#OTU10 "o" "p" "b" "n" "k" "d" "q" "Species"
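The question builds its matrix with sample() and no seed, so the letters differ on every run. The following is a minimal sketch for checking the apply approach on fixed data. The three-row matrix and its values are made up for illustration and are not from the question.

```r
# Hypothetical toy data: 3 OTUs, 7 ranks, plus a 'Lowest' column
ranks <- c("Domain", "Phylum", "Class", "Order", "Family", "Genus", "Species")
m1 <- matrix(letters[1:21], nrow = 3, ncol = 7,
             dimnames = list(paste0("OTU", 1:3), ranks))
m1 <- cbind(m1, Lowest = c("Genus", "Order", "Species"))

# Same row-wise logic as in the answer above
res <- t(apply(m1, 1, function(x) {
  i1 <- match(x[8], names(x)[-8]) + 1
  i1[i1 > 7] <- 0
  i1 <- if (i1 != 0) i1:7 else i1
  c(replace(x[-8], i1, "unknown"), x[8])
}))

# Inspect the columns that can change
res[, c("Family", "Genus", "Species", "Lowest")]
```

With this data, OTU1 (Genus) should lose only its Species value, OTU2 (Order) should lose Family, Genus and Species, and OTU3 (Species) should stay unchanged.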
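Another option is to create a row/column index: match the last column of 'm1' against the column names to get the column positions, pair them with the corresponding row numbers, cbind the two into an index matrix, and assign 'unknown' to those cells of 'm1'. For rows whose lowest rank is 'Species' the column index is 0, and rows of an index matrix that contain a zero are ignored in the assignment.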
lst <- Map(function(x, y) if(x >y) 0 else x:y, match(m1[,8], colnames(m1)[-8])+1, 7)
m1[cbind(rep(seq_len(nrow(m1)), lengths(lst)), unlist(lst))] <- "unknown"
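A further variant, which is not part of the answer above, is to compare a matrix of column numbers against each row's 'Lowest' position and assign through the resulting logical mask. This is only a sketch on the same kind of made-up toy data. The names m, pos, mask and out are hypothetical.

```r
# Hypothetical toy data: 3 OTUs, 7 ranks, plus a 'Lowest' column
ranks <- c("Domain", "Phylum", "Class", "Order", "Family", "Genus", "Species")
m <- matrix(letters[1:21], nrow = 3, ncol = 7,
            dimnames = list(paste0("OTU", 1:3), ranks))
m <- cbind(m, Lowest = c("Genus", "Order", "Species"))

# Column position of each row's lowest trusted rank
pos <- match(m[, "Lowest"], ranks)

# col() gives every cell its column number; 'pos' is recycled down the rows,
# so mask[i, j] is TRUE when column j lies after row i's lowest rank
mask <- col(m[, ranks]) > pos

out <- m
out[, ranks][mask] <- "unknown"
out
```

This relies on pos having one entry per row, so that recycling lines up with the column-major layout of the matrix. A 'Lowest' value that is not one of the rank names would give NA in pos and would need separate handling.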