我正在使用simper
包中的vegan
函数。简而言之,simper
比较一组组,并计算哪些变量对它们的不相似性贡献最大,以及在给出累积贡献的名为cusum
的列中计算多少。输出是每个组间对比度及其结果的嵌套列表。例如
library(vegan)
library(data.table)
library(tidyr)
data(dune)
data(dune.env)
sim <- with(dune.env, simper(dune, Management))
simsum<-summary(sim)
#(short version of output)
$SF_BF
average sd ratio ava avb cumsum
Agrostol 0.061373875 0.034193273 1.7949108 4.6666667 0.0000000 0.09824271
Alopgeni 0.052667124 0.036475863 1.4438897 4.3333333 0.6666667 0.18254830
$SF_HF
average sd ratio ava avb cumsum
Agrostol 0.047380081 0.031272715 1.5150613 4.6666667 1.4 0.08350879
Alopgeni 0.046433015 0.032896891 1.4114712 4.3333333 1.6 0.16534834
$SF_NM
average sd ratio ava avb cumsum
Poatriv 0.078284148 0.040947182 1.9118324 4.6666667 0.0000000 0.1013601
Alopgeni 0.071219425 0.046958337 1.5166513 4.3333333 0.0000000 0.1935731
由此,我感兴趣1)每个嵌套列表的名称(即哪些组正在对比),2)rownames(即哪些变量有助于相异性),以及3)cusum列(即他们贡献了多少。)
我想把它变成一个对比矩阵,显示每个组间对比的前三个贡献变量,这样它更容易阅读,并且不会占用太多空间。这是我在excel中做的一个例子:
我怀疑这会很棘手,但这是我到目前为止所做的:
top3<-lapply(simsum, `[`,1:3,)#get top 3 contributors
cuss<-lapply(top3, `[`,6)#get last column
rows<-lapply(top3, rownames)#get names from list
rows2<-lapply(cuss, cumsum)#get values from list
rowsdf<-do.call(rbind, lapply(rows, data.frame, stringsAsFactors=FALSE))#names into df
cusumdf<-do.call(rbind, lapply(rows2, data.frame, stringsAsFactors=FALSE))#values into df
simperdf<-cbind(rowsdf,cusumdf) #combine into one df
colnames(simperdf)<-c('name','cusum') #change colnames
setDT(simperdf, keep.rownames = TRUE)[]#convert rownames to a column
simperdf<-separate(data = simperdf, col = rn, into = c("left", "right"), sep = "\\_")#seperate contrasts names
simperdf<-separate(data = simperdf, col = right, into = c("right", "delete"), sep = "\\.")#separate numbers
simperdf$delete<-NULL#delete number column
这给出了这个整洁的小数据帧:
left right name cusum
1: SF BF Agrostol 0.09824271
2: SF BF Alopgeni 0.28079100
3: SF BF Lolipere 0.54036058
4: SF HF Agrostol 0.08350879
5: SF HF Alopgeni 0.24885713
6: SF HF Lolipere 0.48820643
7: SF NM Poatriv 0.10136013
8: SF NM Alopgeni 0.29493318
9: SF NM Agrostol 0.56167145
10: BF HF Rumeacet 0.08163219
11: BF HF Poatriv 0.23357016
12: BF HF Planlanc 0.45275349
13: BF NM Lolipere 0.12427183
14: BF NM Poatriv 0.32348443
15: BF NM Poaprat 0.59466001
16: HF NM Poatriv 0.09913221
17: HF NM Lolipere 0.27381681
18: HF NM Rumeacet 0.51298871
但我不确定从哪里开始。我看到contrasts(dune.env$Management)
会给出矩阵的框架:
HF NM SF
BF 0 0 0
HF 1 0 0
NM 0 1 0
SF 0 0 1
但我不确定如何手动填充它。任何帮助将不胜感激。
答案 0 :(得分:1)
这不是你想要的,但我认为这是一个正确的方向:
require(tables)
test <- data.frame(left = c("SF", "SF", "BF", "BF"),
right = c("BF","BF", "SF", "SF"),
name = c("Agrostol", "Alopgeni","Agrostol", "Alopgeni2"),
cumv = c(1,2,3,4))
tabular(right * name ~ left * cumv * mean, data = test)
给出输出:
left
BF SF
cumv cumv
right name mean mean
BF Agrostol NaN 1
Alopgeni NaN 2
Alopgeni2 NaN NaN
SF Agrostol 3 NaN
Alopgeni NaN NaN
Alopgeni2 4 NaN