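I'm doing some biogeographic analyses in R, and the results are encoded as a pair of matrices. Columns represent geographic regions, rows represent nodes in a phylogenetic tree, and each value is the probability that the branching event occurred in the region indicated by the column. A very simple example would be: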
> One_node <- matrix(c(0, 0.8, 0.2, 0),
+                    nrow = 1, ncol = 4,
+                    dimnames = list(c("node_1"),
+                                    c("A", "B", "C", "D")))
> One_node
       A   B   C D
node_1 0 0.8 0.2 0
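In this case, the most probable location of node_1 is region B. In reality, the output of the analysis is encoded as two separate 79x123 matrices. The first holds the probability of a node occupying a given region before an event, and the second holds the probability of the node occupying a given region after the event (rowSums = 1). Some slightly more complex examples: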
before<-matrix(c(0,0,0,0,0.9,
0.8,0.2,0.6,0.4,0.07,
0.2,0.8,0.4,0.6,0.03,
0,0,0,0,0),
nrow=5, ncol=4,
dimnames = list(c("node_1","node_2","node_3","node_4","node_5"),
c("A","B","C","D")))
after<-matrix(c(0,0,0,0,0.9,
0.2,0.8,0.4,0.6,0.03,
0.8,0.2,0.6,0.4,0.07,
0,0,0,0,0),
nrow=5, ncol=4,
dimnames = list(c("node_1","node_2","node_3","node_4","node_5"),
c("A","B","C","D")))
> before
         A    B    C D
node_1 0.0 0.80 0.20 0
node_2 0.0 0.20 0.80 0
node_3 0.0 0.60 0.40 0
node_4 0.0 0.40 0.60 0
node_5 0.9 0.07 0.03 0
> after
         A    B    C D
node_1 0.0 0.20 0.80 0
node_2 0.0 0.80 0.20 0
node_3 0.0 0.40 0.60 0
node_4 0.0 0.60 0.40 0
node_5 0.9 0.03 0.07 0
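Because the two matrices are compared row by row, it may be worth confirming first that they line up and that each row really sums to 1. The following is only a minimal sketch: `check_pair` and its `tol` argument are hypothetical names that do not come from the analysis, and the two small matrices are stand-ins for the real 79x123 ones.

```r
# Hypothetical helper (not part of the original analysis): stops with an
# error unless the two probability matrices can be compared row by row.
check_pair <- function(before, after, tol = 1e-8) {
  stopifnot(
    identical(dim(before), dim(after)),            # same shape
    identical(dimnames(before), dimnames(after)),  # same nodes and regions
    all(abs(rowSums(before) - 1) < tol),           # each row is a distribution
    all(abs(rowSums(after) - 1) < tol)
  )
  invisible(TRUE)
}

# Small stand-in matrices (assumed values, two nodes and two regions).
b <- matrix(c(0.8, 0.2, 0.2, 0.8), nrow = 2,
            dimnames = list(c("node_1", "node_2"), c("B", "C")))
a <- matrix(c(0.2, 0.8, 0.8, 0.2), nrow = 2,
            dimnames = list(c("node_1", "node_2"), c("B", "C")))

check_pair(b, a)  # silent when everything lines up
```

A tolerance is used rather than an exact comparison because probabilities stored as floating-point numbers rarely sum to exactly 1.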
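Specifically, I want to extract only the row numbers where column B is the highest in before and column C is the highest in after, and vice versa, because I'm trying to extract the node numbers in the tree where taxa have moved B -> C or C -> B.
So the output I'm looking for is something like: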
> BC
[1] 1 3
> CB
[1] 2 4
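Rows where B > C or C > B but where neither is the highest value in the row (node_5) need to be ignored. The row numbers will then be used to query a separate data frame that holds the data I need.
I hope this all makes sense. Thanks in advance for any suggestions!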
Answer 0 (score: 1)
You could do it like this...
maxBefore <- apply(before, 1, which.max) #find highest columns before (by row)
maxAfter <- apply(after, 1, which.max) #and highest columns after
BC <- which(maxBefore==2 & maxAfter==3) #rows with B highest before, C after
CB <- which(maxBefore==3 & maxAfter==2) #rows with C highest before, B after
BC
node_1 node_3
     1      3
CB
node_2 node_4
     2      4
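The approach above hard-codes the column positions 2 and 3 for B and C. With 123 regions it may be easier to refer to regions by name. The following is a sketch of one possible variant, not the answerer's code: `moved` and `node_info` are hypothetical names, and the contents of `node_info` are made up purely to illustrate the final lookup step. It uses base R's `max.col`, with ties resolved in favour of the first column.

```r
# Rebuild the example matrices from the question.
regions <- c("A", "B", "C", "D")
nodes <- paste0("node_", 1:5)
before <- matrix(c(0, 0, 0, 0, 0.9,
                   0.8, 0.2, 0.6, 0.4, 0.07,
                   0.2, 0.8, 0.4, 0.6, 0.03,
                   0, 0, 0, 0, 0),
                 nrow = 5, dimnames = list(nodes, regions))
after <- matrix(c(0, 0, 0, 0, 0.9,
                  0.2, 0.8, 0.4, 0.6, 0.03,
                  0.8, 0.2, 0.6, 0.4, 0.07,
                  0, 0, 0, 0, 0),
                nrow = 5, dimnames = list(nodes, regions))

# Hypothetical helper: row numbers whose most probable region is `from`
# in `before` and `to` in `after`. Ties go to the first column.
moved <- function(before, after, from, to) {
  top_before <- colnames(before)[max.col(before, ties.method = "first")]
  top_after  <- colnames(after)[max.col(after, ties.method = "first")]
  which(top_before == from & top_after == to)
}

BC <- moved(before, after, "B", "C")  # rows 1 and 3
CB <- moved(before, after, "C", "B")  # rows 2 and 4

# Made-up data frame standing in for the separate per-node data.
node_info <- data.frame(node = nodes, label = letters[1:5],
                        stringsAsFactors = FALSE)
node_info[BC, ]  # rows for the nodes that moved B -> C
```

Because the comparison is done on unnamed character vectors, `which` returns plain row numbers, matching the `[1] 1 3` form asked for in the question. node_5 is excluded automatically, since its most probable region is A both before and after.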