Getting the wrong inverse matrix from solve() and ginv()

Date: 2016-07-24 14:51:35

Tags: r matrix-inverse

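I am trying to compute the inverse of a covariance matrix in R. I use the solve and ginv functions, but multiplying the matrix by either result does not give the identity matrix. Why does this happen? Has anyone run into this before, and how can I fix it?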

d <- data.frame(r1,r2,r3,r4)
ad <- cov(d) # get covariance

You can reproduce ad with:

ad <- structure(c(0.000564103047135273, 0.000209426917735389, 
 -2.50601812852379e-07, 0.000318692159506722, 0.000209426917735389, 
 8.92756413718721e-05, -9.42226640483041e-08, 0.000135853820739977, 
 -2.50601812852379e-07, -9.42226640483041e-08, 1.12190005351388e-10, 
 -1.43381875666865e-07, 0.000318692159506722, 0.000135853820739977, 
 -1.43381875666865e-07, 0.000206733441799332), 
  .Dim = c(4L, 4L), 
  .Dimnames = list(c("r1", "r2", "r3", "r4"), 
                   c("r1", "r2", "r3", "r4")))

#       r1            r2            r3            r4
#r1  5.641030e-04  2.094269e-04 -2.506018e-07  3.186922e-04
#r2  2.094269e-04  8.927564e-05 -9.422266e-08  1.358538e-04
#r3 -2.506018e-07 -9.422266e-08  1.121900e-10 -1.433819e-07
#r4  3.186922e-04  1.358538e-04 -1.433819e-07  2.067334e-04

library(MASS) # provides ginv()
a <- ginv(ad, tol = 1e-18) # calculate the inverse of the matrix by ginv
#         [,1]         [,2]         [,3]        [,4]
#[1,] 2.369507e+05     7332.961 5.497029e+08    11158.82
#[2,] 7.332964e+03     9194.958 4.198473e+07    13992.28
#[3,] 5.497029e+08 41984725.523 1.353713e+12 63889594.47
#[4,] 1.115881e+04    13992.283 6.388959e+07    21292.54

b <- solve(ad,tol = 1e-18) # calculate the inverse of the matrix by solve   
#         r1            r2            r3            r4
#r1    236950.7  1.701817e+06     549702919 -1.099090e+06
#r2   1184944.4 -2.245684e+20    2905309940  1.475740e+20
#r3 549702919.1  4.213828e+09 1353713017167 -2.677616e+09
#r4   -762702.5  1.475740e+20   -1817729881 -9.697747e+19

ad%*%a
#  r1         r2          r3           r4
#r1  1.000000e+00 3.552714e-15 4.729372e-11 -8.881784e-16
#r2  9.547918e-15 3.015976e-01 5.820766e-11  4.589515e-01
#r3 -2.385245e-18 3.234236e-12 1.000000e+00 -2.125362e-12
#r4 -8.881784e-15 4.589515e-01 6.002665e-11  6.984024e-01

ad%*%b
#  r1    r2          r3           r4
#r1 1.000000e+00  0 0.000000e+00  0.000000000
#r2 1.421085e-14  0 2.910383e-11  0.000000000
#r3 0.000000e+00  0 1.000000e+00 -0.001953125
#r4 2.842171e-14 -4 0.000000e+00  4.000000000

1 answer:

Answer 0 (score: 5)

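Highly collinear variables threaten the numerical stability of the matrix computations at the core of regression methods. One way to assess the degree of multicollinearity is to compute variance inflation factors (VIFs). This code is taken from the vif function in Harrell's rms package: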

library(rms)
dput(ad)
structure(c(0.0046716674, 0.0017256716, -0.0001918083, 0.0027385111, 
0.001725672, 0.00073037, -8.1816e-05, 0.001159042, -0.0001918083, 
-8.1816e-05, 9.169343e-06, -0.0001298359, 0.0027385111, 0.0011590423, 
-0.0001298359, 0.0018393131), .Dim = c(4L, 4L), .Dimnames = list(
    c("r1", "r2", "r3", "r4"), c("r1", "r2", "r3", "r4")))

nam <- dimnames(ad)[[1]]
d <- diag(ad)^0.5                      # standard deviations
vif.vals <- diag(solve(ad/(d %o% d)))  # diagonal of the inverse correlation matrix
names(vif.vals) <- nam
vif.vals
#---
#          r1           r2           r3           r4 
#6.535378e+01 3.338701e+06 1.768640e+04 3.389362e+06 
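
The same diagonal can also be obtained in base R with cov2cor, without loading rms. This is only a minimal sketch: it assumes the ad values from the dput output above, and vif_from_cov is a hypothetical helper name, not a function from rms.

```r
# Values copied from the dput(ad) output above.
ad <- structure(c(0.0046716674, 0.0017256716, -0.0001918083, 0.0027385111,
0.001725672, 0.00073037, -8.1816e-05, 0.001159042, -0.0001918083,
-8.1816e-05, 9.169343e-06, -0.0001298359, 0.0027385111, 0.0011590423,
-0.0001298359, 0.0018393131), .Dim = c(4L, 4L), .Dimnames = list(
    c("r1", "r2", "r3", "r4"), c("r1", "r2", "r3", "r4")))

# Hypothetical helper: VIFs as the diagonal of the inverse correlation matrix.
vif_from_cov <- function(S) {
  R <- cov2cor(S)        # rescale the covariance matrix to a correlation matrix
  v <- diag(solve(R))    # diagonal of its inverse
  names(v) <- rownames(S)
  v
}

vif.base <- vif_from_cov(ad)
vif.base
```

cov2cor performs the same rescaling as ad/(d %o% d), so the two approaches should agree up to rounding.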

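A common rule of thumb is to be concerned when a VIF exceeds 10. By that standard, your VIFs are astronomical. (Using your original values, the largest VIFs are even higher: two of them are 1.185158e+14, and all of them are far above 10.)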
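
To see where the near-singularity comes from, you can also look at the correlation matrix and the singular values of the question's matrix directly. The following is a minimal base-R sketch using the ad values posted in the question; R, sv and cond.num are illustrative names, and what counts as "too large" is a judgment call.

```r
# Covariance matrix as posted in the question.
ad <- structure(c(0.000564103047135273, 0.000209426917735389,
 -2.50601812852379e-07, 0.000318692159506722, 0.000209426917735389,
 8.92756413718721e-05, -9.42226640483041e-08, 0.000135853820739977,
 -2.50601812852379e-07, -9.42226640483041e-08, 1.12190005351388e-10,
 -1.43381875666865e-07, 0.000318692159506722, 0.000135853820739977,
 -1.43381875666865e-07, 0.000206733441799332),
  .Dim = c(4L, 4L),
  .Dimnames = list(c("r1", "r2", "r3", "r4"),
                   c("r1", "r2", "r3", "r4")))

# Pairwise correlations: off-diagonal entries close to +/-1 flag
# pairs of variables that are nearly linear functions of each other.
R <- cov2cor(ad)
round(R, 6)

# Ratio of the largest to the smallest singular value (condition number).
sv <- svd(ad)$d
cond.num <- max(sv) / min(sv)
cond.num
```

If that ratio is on the order of 1/.Machine$double.eps or larger, the matrix is singular to working precision, and no routine can return a matrix whose product with ad is the identity. Lowering tol only suppresses the error. It is usually more useful to drop or combine the redundant variable; the r2/r4 correlation is worth checking first.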