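I have a matrix of species occurrences across sites, and for each pair of species a and b I want to compute the following formula:
$$C_{ab} = \frac{(R_a - S)(R_b - S)}{R_a R_b}$$
where R_a and R_b are the numbers of occurrences of species a and b, and S is the number of sites where a and b co-occur.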
My solution so far is very slow (too slow, in fact, for my real matrix):
set.seed(1)
# Example of binary matrix with sites in rows and species in columns
mat <- matrix(runif(200), ncol = 20)
mat_bin <- mat
mat_bin[mat_bin > 0.5] <- 1
mat_bin[mat_bin <= 0.5] <- 0
rownames(mat_bin) <- paste0("site_", seq(1:nrow(mat_bin)))
colnames(mat_bin) <- paste0("sp_", seq(1:ncol(mat_bin)))
# Number of occurrences for every species
nbocc <- colSums(mat_bin)
# Number of cooccurrences between species
S <- crossprod(mat_bin)
diag(S) <- 0
# Data frame with all the pair combinations
comb <- data.frame(t(combn(colnames(mat_bin), 2)))
colnames(comb) <- c("sp1", "sp2")
comb$Cscore <- 0
# Slow for_loop to compute the Cscore of each pair
for(i in 1:nrow(comb)){
  num <- (nbocc[[comb[i, "sp1"]]] - S[comb[i, "sp1"], comb[i, "sp2"]]) *
    (nbocc[[comb[i, "sp2"]]] - S[comb[i, "sp1"], comb[i, "sp2"]])
  denom <- nbocc[[comb[i, "sp1"]]] * nbocc[[comb[i, "sp2"]]]
  comb[i, "Cscore"] <- num/denom
}
A first option would be to parallelize the for loop, but there may be a more efficient solution.
Answer 0 (score: 1)
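Since you already start from the co-occurrence matrix S, you can do the whole computation on matrices in a vectorized way.
It looks like this: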
set.seed(1)
# Example of binary matrix with sites in rows and species in columns
mat <- matrix(runif(200), ncol = 20)
mat_bin <- mat
mat_bin[mat_bin > 0.5] <- 1
mat_bin[mat_bin <= 0.5] <- 0
rownames(mat_bin) <- paste0("site_", seq(1:nrow(mat_bin)))
colnames(mat_bin) <- paste0("sp_", seq(1:ncol(mat_bin)))
# Number of occurrences for every species
nbocc <- colSums(mat_bin)
# Number of cooccurrences between species
S <- crossprod(mat_bin)
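# nbocc is recycled down the columns of S, so (nbocc - S)[i, j] is R_i - S_ij;
# the transpose gives R_j - S_ij because S is symmetric
resMat <- (nbocc - S) * t(nbocc - S) /
  outer(nbocc, nbocc, `*`)
# Every pair appears twice in resMat; the lower triangle (read column by column)
# holds one value per pair, in the same order as combn(colnames(mat_bin), 2)
resMat[lower.tri(resMat)]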
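If you want the result back in the pair table from the question, one option is to index the score matrix by species names. The following is only a sketch under a few assumptions: it rebuilds the same example data in a more compact way, keeps the names `comb`, `sp1`, `sp2` and `Cscore` from the question, and introduces `ref` as a purely illustrative variable holding pair-by-pair reference values. It also checks that the vectorized values agree with an explicit loop over the pairs.

```r
set.seed(1)
# Same example data as above, built in one step (sites in rows, species in columns)
mat_bin <- matrix(as.numeric(runif(200) > 0.5), ncol = 20)
colnames(mat_bin) <- paste0("sp_", seq_len(ncol(mat_bin)))

nbocc <- colSums(mat_bin)   # occurrences per species
S <- crossprod(mat_bin)     # co-occurrences per pair of species

# Vectorized C-score matrix, same expression as in the answer
resMat <- (nbocc - S) * t(nbocc - S) / outer(nbocc, nbocc)

# Pair table in combn() order; a two-column character matrix indexes
# resMat by (row name, column name)
comb <- data.frame(t(combn(colnames(mat_bin), 2)), stringsAsFactors = FALSE)
colnames(comb) <- c("sp1", "sp2")
comb$Cscore <- resMat[cbind(comb$sp1, comb$sp2)]

# Reference values computed pair by pair, as in the original loop
ref <- vapply(seq_len(nrow(comb)), function(i) {
  a <- comb$sp1[i]
  b <- comb$sp2[i]
  (nbocc[[a]] - S[a, b]) * (nbocc[[b]] - S[a, b]) / (nbocc[[a]] * nbocc[[b]])
}, numeric(1))

# Both checks pass silently when the vectorized and looped values agree
stopifnot(isTRUE(all.equal(comb$Cscore, ref)))
stopifnot(isTRUE(all.equal(comb$Cscore, resMat[lower.tri(resMat)])))
```

Indexing by name avoids relying on the ordering of `lower.tri()`, which may be safer if the pair table is later filtered or reordered. Note that a species with no occurrences at all would produce a zero denominator and therefore `NaN` for its pairs.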