rcorr() function for correlations

Time: 2016-05-14 12:05:13

Tags: r correlation

I am using the rcorr() function in R to compute the correlations between two different matrices:

res <- rcorr(as.matrix(table1), as.matrix(table2),type="pearson")

It seems to work fine, but I would like to avoid the within-table correlations. Any suggestions?

1 answer:

Answer 0: (score: 6)

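Consider using base R's cor() to get only the correlations between the two sets, because Hmisc's rcorr() returns all possible column combinations, including those within each table. Note below that the upper-right quadrant of the rcorr() output (repeated symmetrically in the lower left) is the entire result of cor(), rounded to two decimal places.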

library(Hmisc)  # provides rcorr()

table1 <- matrix(rnorm(25),5)
table2 <- matrix(rnorm(25),5)

res <- rcorr(table1, table2, type="pearson")
res
#       [,1]  [,2]  [,3]  [,4]  [,5]  | [,6]  [,7]  [,8]  [,9] [,10]
# [1,]  1.00 -0.55  0.95 -0.16  0.17 |-0.46  0.15  0.10  0.69  0.16
# [2,] -0.55  1.00 -0.55 -0.60 -0.79 |-0.45 -0.66 -0.22 -0.30  0.12
# [3,]  0.95 -0.55  1.00 -0.09  0.30 |-0.35 -0.05 -0.17  0.57 -0.03
# [4,] -0.16 -0.60 -0.09  1.00  0.91 | 0.92  0.53 -0.21 -0.58 -0.71
# [5,]  0.17 -0.79  0.30  0.91  1.00 | 0.78  0.41 -0.31 -0.32 -0.68
# ------------------------------------------------------------------
# [6,] -0.46 -0.45 -0.35  0.92  0.78 | 1.00  0.44 -0.14 -0.62 -0.58
# [7,]  0.15 -0.66 -0.05  0.53  0.41 | 0.44  1.00  0.68  0.13  0.13
# [8,]  0.10 -0.22 -0.17 -0.21 -0.31 |-0.14  0.68  1.00  0.59  0.80
# [9,]  0.69 -0.30  0.57 -0.58 -0.32 |-0.62  0.13  0.59  1.00  0.80
#[10,]  0.16  0.12 -0.03 -0.71 -0.68 |-0.58  0.13  0.80  0.80  1.00

# pvalues to follow ...

res <- cor(table1, table2, method="pearson")
res

#            [,1]        [,2]       [,3]       [,4]        [,5]
# [1,] -0.4551474  0.15080994  0.1008215  0.6894955  0.16390813
# [2,] -0.4468285 -0.66209106 -0.2154960 -0.2954581  0.11662382
# [3,] -0.3542023 -0.05474287 -0.1720881  0.5669501 -0.02880113
# [4,]  0.9246330  0.53456574 -0.2084105 -0.5807386 -0.71108552
# [5,]  0.7788395  0.40551828 -0.3122606 -0.3209273 -0.67912147
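
The quadrant relationship described above can be checked directly. The following is a minimal sketch in base R only, under the assumption that the columns of table1 come first in the combined matrix; the seed and the names full, idx1, idx2 and cross are illustrative and not part of the original answer. With Hmisc loaded, the r and P components of the list returned by rcorr() could be subset with the same indices (for example res$r[idx1, idx2]), which is not run here.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
table1 <- matrix(rnorm(25), 5)
table2 <- matrix(rnorm(25), 5)

# Full 10 x 10 correlation matrix over all columns of both tables
full <- cor(cbind(table1, table2), method = "pearson")

# Column positions of each table inside the combined matrix
idx1 <- seq_len(ncol(table1))
idx2 <- ncol(table1) + seq_len(ncol(table2))

# Upper-right block: table1 columns (rows) vs. table2 columns (columns)
cross <- full[idx1, idx2]

# Compare against the direct two-matrix call
all.equal(cross, cor(table1, table2))
```

Subsetting this way keeps only the between-table correlations and discards both within-table blocks.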

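The only caveat is significance testing: cor() does not return test statistics such as t-statistics and p-values. However, these can be retrieved with cor.test(), which you can run iteratively using mapply(). The code below demonstrates one test pairing and then generalizes it to all other column pairs. Note that the estimates from the tests correspond to the values in the cor() output.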

# EXAMPLE OF FIRST COL PAIRING
res <- cor.test(table1[,1], table2[,1], method="pearson")
res

#   Pearson's product-moment correlation

# data:  table1[, 1] and table2[, 1]
# t = -0.88536, df = 3, p-value = 0.4412
# alternative hypothesis: true correlation is not equal to 0
# 95 percent confidence interval:
#  -0.9542314  0.7137222
# sample estimates:
#        cor 
# -0.4551474

# OBTAIN ALL MATRIX COL COMBINATIONS
tblcols <- expand.grid(1:ncol(table1), 1:ncol(table2))

# MAPPLY COR.TEST ACROSS ALL COLS
cfunc <- function(var1, var2) {
              cor.test(table1[,var1], table2[,var2], method="pearson")
         }

res <- mapply(function(a,b) {
                 cfunc(var1 = a, var2 = b)
        }, tblcols$Var1, tblcols$Var2)

head(res)

#             [,1]        [,2]        [,3]        [,4]       
# statistic   -0.8853596  -0.8650936  -0.6560274  4.204994   
# parameter   3           3           3           3          
# p.value     0.4411699   0.4506234   0.5586316   0.02455469 
# estimate    -0.4551474  -0.4468285  -0.3542023  0.924633   
# null.value  0           0           0           0          
# alternative "two.sided" "two.sided" "two.sided" "two.sided"
#             [,5]        [,6]        [,7]        [,8]       
# statistic   2.150733    0.2642326   -1.53021    -0.09495982
# parameter   3           3           3           3          
# p.value     0.1206246   0.8087132   0.2234562   0.930334   
# estimate    0.7788395   0.1508099   -0.6620911  -0.05474287
# null.value  0           0           0           0          
# alternative "two.sided" "two.sided" "two.sided" "two.sided"
# ...
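
To line the test results up with the cor() output, the p-values and estimates can be pulled out of the mapply() result and reshaped. This is a sketch that assumes the same expand.grid() ordering as above, where Var1 varies fastest so a column-major fill puts table1 columns in rows; the seed and the names pvals and ests are illustrative and not from the original answer.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
table1 <- matrix(rnorm(25), 5)
table2 <- matrix(rnorm(25), 5)

# All column pairings; Var1 (table1 column) varies fastest
tblcols <- expand.grid(seq_len(ncol(table1)), seq_len(ncol(table2)))

# One cor.test() per pairing; mapply() simplifies the results into a
# list-matrix with one row per htest component and one column per pairing
res <- mapply(function(a, b) {
  cor.test(table1[, a], table2[, b], method = "pearson")
}, tblcols$Var1, tblcols$Var2)

# Reshape into 5 x 5 matrices: rows = table1 columns, cols = table2 columns
pvals <- matrix(unlist(res["p.value", ]), nrow = ncol(table1))
ests  <- matrix(unlist(res["estimate", ]), nrow = ncol(table1))

# The estimates should agree with the direct cor() call
all.equal(ests, cor(table1, table2))
```

Here pvals[i, j] is the p-value for column i of table1 against column j of table2, in the same layout as cor(table1, table2).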