R: computing an index from a co-occurrence matrix

Time: 2018-02-16 10:27:41

Tags: r

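I have a matrix of species occurrences across sites, and I want to compute the following formula for every pair of species a and b:

$$C_{ab} = \frac{(R_a - S_{ab})(R_b - S_{ab})}{R_a R_b}$$

where $R_a$ and $R_b$ are the numbers of occurrences of species a and b, and $S_{ab}$ is the number of sites where the two species co-occur.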
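As a quick illustration of the score, here is a minimal sketch for a single pair of species. The vectors `a` and `b` and their values are made up for this example and are not taken from the data below.

```r
# Hypothetical presence/absence of two species across five sites
a <- c(1, 1, 0, 1, 0)
b <- c(1, 0, 1, 1, 1)

R_a  <- sum(a)      # occurrences of species a: 3
R_b  <- sum(b)      # occurrences of species b: 4
S_ab <- sum(a * b)  # sites where both occur: 2

# Score for the pair: (R_a - S_ab) * (R_b - S_ab) / (R_a * R_b)
C_ab <- (R_a - S_ab) * (R_b - S_ab) / (R_a * R_b)
C_ab  # 1 * 2 / 12 = 0.1666667
```

A score of 0 means that at least one species is never found without the other; a score of 1 means the two never share a site.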

My solution so far is very slow (in fact, too slow for my actual matrix):

set.seed(1)
# Example of binary matrix with sites in rows and species in columns
mat <- matrix(runif(200), ncol = 20)
mat_bin <- mat
mat_bin[mat_bin > 0.5] <- 1
mat_bin[mat_bin <= 0.5] <- 0
rownames(mat_bin) <- paste0("site_", seq_len(nrow(mat_bin)))
colnames(mat_bin) <- paste0("sp_", seq_len(ncol(mat_bin)))

# Number of occurrences for every species
nbocc <- colSums(mat_bin)

# Number of cooccurrences between species
S <- crossprod(mat_bin)
diag(S) <- 0 

# Data frame with all the pair combinations
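# stringsAsFactors = FALSE keeps the names as character; in R < 4.0 they would
# become factors, and indexing by a factor uses its integer codes, not its labels
comb <- data.frame(t(combn(colnames(mat_bin), 2)), stringsAsFactors = FALSE)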
colnames(comb) <- c("sp1", "sp2")
comb$Cscore <- 0

# Slow for_loop to compute the Cscore of each pair
for (i in seq_len(nrow(comb))) {
  num <- (nbocc[[comb[i, "sp1"]]] - S[comb[i, "sp1"], comb[i, "sp2"]]) *
    (nbocc[[comb[i, "sp2"]]] - S[comb[i, "sp1"], comb[i, "sp2"]])

  denom <- nbocc[[comb[i, "sp1"]]] * nbocc[[comb[i, "sp2"]]]

  comb[i, "Cscore"] <- num/denom
}

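A first option would be to parallelize the for loop, but there is probably a more efficient solution.

1 Answer:

Answer 0: (score: 1)

Since you already start from S, you can do the whole computation in a vectorized way, working directly on matrices.

It looks as follows: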

set.seed(1)
# Example of binary matrix with sites in rows and species in columns
mat <- matrix(runif(200), ncol = 20)
mat_bin <- mat
mat_bin[mat_bin > 0.5] <- 1
mat_bin[mat_bin <= 0.5] <- 0
rownames(mat_bin) <- paste0("site_", seq_len(nrow(mat_bin)))
colnames(mat_bin) <- paste0("sp_", seq_len(ncol(mat_bin)))

# Number of occurrences for every species
nbocc <- colSums(mat_bin)

# Number of cooccurrences between species
S <- crossprod(mat_bin)

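# nbocc is recycled down the columns of S, so (nbocc - S)[i, j] is R_i - S_ij;
# the transpose supplies R_j - S_ij because S is symmetric
resMat <- (nbocc - S) * t(nbocc - S) /
  outer(nbocc, nbocc, `*`)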

# in the end you would need just the triangle
resMat[lower.tri(resMat)]
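To get back to the pair table from the question, the triangle can be assigned straight to the `Cscore` column. The sketch below assumes the same kind of toy matrix as above. It relies on `lower.tri()` walking the lower triangle column by column, which gives the same pair order as `combn()` because `resMat` is symmetric. The `direct` vector is a name introduced here only to cross-check the result against the per-pair formula.

```r
set.seed(1)
# Toy presence/absence matrix: 10 sites in rows, 20 species in columns
mat_bin <- matrix(as.numeric(runif(200) > 0.5), ncol = 20)
colnames(mat_bin) <- paste0("sp_", seq_len(ncol(mat_bin)))

nbocc <- colSums(mat_bin)   # occurrences per species
S <- crossprod(mat_bin)     # co-occurrences per pair of species
resMat <- (nbocc - S) * t(nbocc - S) / outer(nbocc, nbocc, `*`)

# Pair table in combn() order: (sp_1, sp_2), (sp_1, sp_3), ..., (sp_19, sp_20)
comb <- data.frame(t(combn(colnames(mat_bin), 2)), stringsAsFactors = FALSE)
colnames(comb) <- c("sp1", "sp2")

# Lower triangle read column by column matches the combn() pair order
comb$Cscore <- resMat[lower.tri(resMat)]

# Cross-check: apply the formula pair by pair, looking values up by name
s <- S[cbind(comb$sp1, comb$sp2)]
direct <- (nbocc[comb$sp1] - s) * (nbocc[comb$sp2] - s) /
  (nbocc[comb$sp1] * nbocc[comb$sp2])
stopifnot(isTRUE(all.equal(comb$Cscore, unname(direct))))

head(comb)
```

Note that a species with no occurrences at all makes the denominator zero, so its pairs come out as `NaN`; such columns may need to be dropped beforehand.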