Getting different correlation values for the same pair of vectors

Date: 2015-01-31 06:36:01

Tags: r correlation

Why do I get different correlations for the same pair of columns (rocky1Rating and rocky2Rating) in the two calls below?

> cor(finalDB[2:6],use="complete.obs")

             rocky1Rating rocky2Rating rocky3Rating rocky4Rating rocky5Rating
rocky1Rating    1.0000000    ***0.6476523***    0.5435555    0.4964198    0.3483168
rocky2Rating    0.6476523    1.0000000    0.7507204    0.6653651    0.5288312
rocky3Rating    0.5435555    0.7507204    1.0000000    0.7284123    0.5897088
rocky4Rating    0.4964198    0.6653651    0.7284123    1.0000000    0.6006595
rocky5Rating    0.3483168    0.5288312    0.5897088    0.6006595    1.0000000
> cor(finalDB[2],finalDB[3],use = "complete.obs")

             rocky2Rating
rocky1Rating    ***0.6011554***

1 answer:

Answer 0 (score: 3)

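The problem is probably NA values in your data set. When you set use="complete.obs" and pass more than two columns, cor only uses the rows that have no missing values in any of those columns. A row with an NA in rocky3Rating, for instance, is dropped even when computing the rocky1Rating/rocky2Rating correlation, so the five-column call and the two-column call are computed on different sets of rows. If you want missing values to be skipped separately for each pair of columns, set use="pairwise.complete.obs". For example: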

set.seed(15)
mm<-matrix(runif(6*6), nrow=6)
mm[cbind(4:6, 1:3)]<-NA

cor(mm, use="complete.obs")
#              [,1]       [,2]        [,3]         [,4]       [,5]        [,6]
# [1,]  1.000000000  0.7577650  0.41079822  0.004065102 -0.9221867  0.86947546
# [2,]  0.757764997  1.0000000 -0.28363801 -0.649441771 -0.4464391  0.98119111
# [3,]  0.410798223 -0.2836380  1.00000000  0.913388689 -0.7314382 -0.09319206
# [4,]  0.004065102 -0.6494418  0.91338869  1.000000000 -0.3904905 -0.49043755
# [5,] -0.922186730 -0.4464391 -0.73143818 -0.390490510  1.0000000 -0.61077597
# [6,]  0.869475459  0.9811911 -0.09319206 -0.490437552 -0.6107760  1.00000000

cor(mm, use="pairwise.complete.obs")
#            [,1]        [,2]        [,3]       [,4]       [,5]       [,6]
# [1,]  1.0000000  0.70156571  0.50955114 -0.2663486 -0.7637746  0.7643575
# [2,]  0.7015657  1.00000000 -0.01542302 -0.2882218 -0.5666432  0.1206862
# [3,]  0.5095511 -0.01542302  1.00000000  0.8922900 -0.8904275 -0.5660903
# [4,] -0.2663486 -0.28822185  0.89229002  1.0000000 -0.4693979 -0.7574680
# [5,] -0.7637746 -0.56664323 -0.89042748 -0.4693979  1.0000000  0.2974870
# [6,]  0.7643575  0.12068622 -0.56609027 -0.7574680  0.2974870  1.0000000

cor(mm[,1], mm[,2], use="complete.obs")
# [1] 0.7015657

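Note how the last two results match: the [1,2] entry of the pairwise.complete.obs matrix (0.7015657) equals the result of the two-column call, while the [1,2] entry of the complete.obs matrix (0.7577650) does not. See the ?cor help page for details.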
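
To see how many rows each call actually uses, you can count the complete cases directly. The following is only a sketch: `ratings` is a hypothetical stand-in for finalDB[2:4] with made-up values, not the asker's data.

```r
# Hypothetical stand-in for a few columns of finalDB; the values are invented.
ratings <- data.frame(
  rocky1Rating = c(5, 4, 3, 5, 2, 4, NA, 3),
  rocky2Rating = c(4, 4, 2, 5, 1, 3, 5, NA),
  rocky3Rating = c(3, NA, 2, 4, 1, 3, 4, 2)
)

# Rows with no NA in ANY column: what use = "complete.obs" keeps
# when the whole data frame is passed to cor().
n_all <- sum(complete.cases(ratings))

# Rows with no NA in the first two columns only: what the
# two-column call keeps.
n_pair <- sum(complete.cases(ratings[1:2]))

# Number of jointly non-missing rows for every pair of columns.
pair_counts <- crossprod(!is.na(ratings))

# The three calls from the answer, applied to the stand-in data.
full_matrix <- cor(ratings, use = "complete.obs")
pairwise    <- cor(ratings, use = "pairwise.complete.obs")
two_columns <- cor(ratings[1], ratings[2], use = "complete.obs")

print(c(all_columns = n_all, first_pair = n_pair))
print(pair_counts)
```

If `n_all` is smaller than `n_pair`, the full-matrix call has dropped rows that the two-column call still uses, which is enough to explain a different coefficient for the same pair. In that case `pairwise["rocky1Rating", "rocky2Rating"]` should agree with `two_columns`, while the corresponding entry of `full_matrix` generally will not.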