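Starting from an n x m matrix, I want to create a new m x m matrix containing the correlations between every pair of columns of the starting matrix (combinations of columns taken 2 at a time). So if my input is a 3x3 matrix, I want to compute the correlations for column pairs 12, 13 and 23 and assign the results to the destination matrix. I am currently using two nested for loops (~O(n^2)):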
for (i in 1:n) {
  for (j in i+1:n) {
    if (j <= n) {
      tmp <- cor(inMatrix[, i], inMatrix[, j])
      dstMatrix[i, j] <- tmp
    }
  }
}
This seems to work, but I would like to know whether there is a better way to implement it.
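A side note on why the `if (j <= n)` guard is needed in the loop above: in R the `:` operator binds more tightly than `+`, so `i+1:n` is evaluated as `i + (1:n)`, not `(i+1):n`. The small sketch below only illustrates that precedence rule; the values of `n` and `i` are arbitrary choices for the illustration.

```r
n <- 3
i <- 2

# `:` is evaluated before `+`, so this is i + c(1, 2, 3)
a <- i + 1:n
print(a)   # 3 4 5

# With explicit parentheses the sequence starts at i + 1 and stops at n
b <- (i + 1):n
print(b)   # 3

# Caveat: when i == n, (i + 1):n counts DOWN (here 4 3) instead of being
# empty, which is why an outer loop over pairs usually stops at n - 1.
down <- (n + 1):n
print(down)   # 4 3
```

With the parenthesized form and an outer loop that stops one column early, the inner guard is no longer necessary.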
Answer 0 (score: 3)
Simply calling cor(inMatrix) does it (the whole matrix is passed directly to cor()):
n <- 7
m <- 5
set.seed(123)
inMatrix <- replicate(m, sample(c(1, - 1), 1) * cumsum(runif(n)))
inMatrix
# [,1] [,2] [,3] [,4] [,5]
# [1,] 0.7883051 -0.4566147 0.04205953 -0.7085305 -0.7954674
# [2,] 1.1972821 -1.4134481 0.36998025 -1.2525965 -0.8200811
# [3,] 2.0802995 -1.8667822 1.32448390 -1.8467385 -1.2978771
# [4,] 3.0207667 -2.5443529 2.21402322 -2.1358983 -2.0563366
# [5,] 3.0663232 -3.1169863 2.90682662 -2.2830119 -2.2727445
# [6,] 3.5944287 -3.2199110 3.54733344 -3.2460361 -2.5909256
# [7,] 4.4868478 -4.1197359 4.54160321 -4.1483352 -2.8225513
dstMatrix <- matrix(nrow = m, ncol = m)
# Loop over each pair of columns i < j; m is the number of columns
for (i in 1:(m - 1)) {
  for (j in (i + 1):m) {
    dstMatrix[i, j] <- cor(inMatrix[, i], inMatrix[, j])
  }
}
dstMatrix
# [,1] [,2] [,3] [,4] [,5]
# [1,] NA -0.9823516 0.9902370 -0.9688212 -0.9825973
# [2,] NA NA -0.9811424 0.9570599 0.9626469
# [3,] NA NA NA -0.9742235 -0.9862355
# [4,] NA NA NA NA 0.9331879
# [5,] NA NA NA NA NA
dstMatrix_2 <- cor(inMatrix)
dstMatrix_2
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1.0000000 -0.9823516 0.9902370 -0.9688212 -0.9825973
# [2,] -0.9823516 1.0000000 -0.9811424 0.9570599 0.9626469
# [3,] 0.9902370 -0.9811424 1.0000000 -0.9742235 -0.9862355
# [4,] -0.9688212 0.9570599 -0.9742235 1.0000000 0.9331879
# [5,] -0.9825973 0.9626469 -0.9862355 0.9331879 1.0000000
dstMatrix == dstMatrix_2
# [,1] [,2] [,3] [,4] [,5]
# [1,] NA TRUE TRUE FALSE TRUE
# [2,] NA NA TRUE FALSE TRUE
# [3,] NA NA NA FALSE TRUE
# [4,] NA NA NA NA FALSE
# [5,] NA NA NA NA NA
# The differences are on the order of machine precision (about 1e-16), i.e. floating-point rounding; I am not sure exactly what causes them:
dstMatrix - dstMatrix_2
# [,1] [,2] [,3] [,4] [,5]
# [1,] NA 0 0 -1.110223e-16 0.000000e+00
# [2,] NA NA 0 2.220446e-16 0.000000e+00
# [3,] NA NA NA -1.110223e-16 0.000000e+00
# [4,] NA NA NA NA 1.110223e-16
# [5,] NA NA NA NA NA
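Because the two results differ only by floating-point rounding, a tolerance-based comparison is more informative than `==`. The following is a minimal sketch, assuming the same seed and dimensions as above; the names `full`, `loop`, `ut` and `upperOnly` are introduced here only for illustration. It also shows one way to keep just the upper triangle of `cor()`'s result if the NA layout of the loop version is wanted.

```r
set.seed(123)
n <- 7
m <- 5
inMatrix <- replicate(m, sample(c(1, -1), 1) * cumsum(runif(n)))

# Full correlation matrix in one call
full <- cor(inMatrix)

# Pairwise loop version, filling only the upper triangle
loop <- matrix(NA_real_, nrow = m, ncol = m)
for (i in 1:(m - 1)) {
  for (j in (i + 1):m) {
    loop[i, j] <- cor(inMatrix[, i], inMatrix[, j])
  }
}

# Compare the upper triangles with a numeric tolerance instead of ==
ut <- upper.tri(full)
print(isTRUE(all.equal(loop[ut], full[ut])))   # TRUE

# Blank out the diagonal and lower triangle to mimic the loop's layout
upperOnly <- full
upperOnly[!ut] <- NA
```

`all.equal()` uses a default tolerance of about 1.5e-8, so differences of the order of 1e-16 are treated as equal.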
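Answer 1 (score: 1)
Compute the correlation coefficient for each combination of columns. The combn function is used to get the pairs of column numbers.
As @Sotos pointed out, a function can be passed directly to combn, so apply() can be avoided: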
cor_vals <- combn(1:col_n, 2, function(x) cor(mat1[, x[1]], mat1[, x[2]]))
# cor_vals <- apply(combn(1:col_n, 2), 2, function(x) cor(mat1[, x[1]], mat1[, x[2]]))
Assign names to the correlation values:
cor_vals <- setNames(cor_vals, combn(1:col_n, 2, paste0, collapse = ''))
cor_vals
# 12 13 23
# 0.1621491 -0.8211970 0.4299367
Data:
set.seed(1L)
row_n <- 3
col_n <- 3
mat1 <- matrix(runif(row_n * col_n, min = 0, max = 20), nrow = row_n, ncol = col_n)
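The two answers can also be combined: compute the full matrix once with `cor()` and then pick out the pairs in `combn` order by indexing with a two-column matrix of (row, column) positions. This is only a sketch under the same data as above; `pairs`, `cor_full` and `cor_vals_alt` are names made up for the illustration.

```r
set.seed(1L)
row_n <- 3
col_n <- 3
mat1 <- matrix(runif(row_n * col_n, min = 0, max = 20), nrow = row_n, ncol = col_n)

# All column pairs, one pair per column of a 2 x choose(col_n, 2) matrix
pairs <- combn(col_n, 2)

# Full correlation matrix, computed once
cor_full <- cor(mat1)

# Index with a two-column (row, col) matrix to extract one value per pair,
# then name each value after its pair of column numbers
cor_vals_alt <- setNames(cor_full[t(pairs)],
                         apply(pairs, 2, paste0, collapse = ""))

# Reference: the pair-by-pair computation from the answer
cor_vals <- combn(1:col_n, 2, function(x) cor(mat1[, x[1]], mat1[, x[2]]))

print(isTRUE(all.equal(unname(cor_vals_alt), cor_vals)))   # TRUE
```

Indexing by `t(pairs)` rather than by `upper.tri()` keeps the values in the same order as `combn`, which matters once there are more than three columns.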