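I am porting an algorithm from R to C and need to compute the pseudoinverse of a matrix, but the result I get in C differs somewhat from the result I get in R. These differences change the behavior of the algorithm.
The code I use to compute the pseudoinverse in C is this.
I did some reading and found that there are different ways to obtain a pseudoinverse; the one used in C is Moore-Penrose. The function used in R comes from the corpcor library. Both use singular value decomposition.
This is the matrix whose pseudoinverse I want: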
1 0.920980394593472 0.996160973582776 0.996772980609752 0.997372221594439 0.999972797627027
0.920980394593472 1 0.885601439824631 0.88878682654952 0.892173764646865 0.923738536637407
0.996160973582776 0.885601439824631 1 0.999973383442349 0.999885329646229 0.99549326808266
0.996772980609752 0.88878682654952 0.999973383442349 1 0.999969202115456 0.996158288591094
0.997372221594439 0.892173764646865 0.999885329646229 0.999969202115456 1 0.996814694067663
0.999972797627027 0.923738536637407 0.99549326808266 0.996158288591094 0.996814694067663 1
The result I get from the pseudoinverse() function in R is:
1398676681.0709 79599.9582612864 -9585774352.21759 28302547195.6681 -19807136596.5434 -305910496.668656
79591.4731051894 3401.1232804516 52529359.4133139 -126479191.665267 76425077.4778451 -2563699.8428373
-9585920775.52777 52529288.3510008 1003916837759.99 -2454016116733.34 1501977763514.61 -42460326831.3218
28302900052.1238 -126478989.043282 -2454015575342.32 6017016899314.95 -3692050079960.62 101159202486.608
-19807349974.7679 76424938.7106429 1501977155911.81 -3692049404688.94 2270196092100.53 -60571139669.4392
-305903527.744471 -2563701.10409161 -42460406960.0488 101159421351.019 -60571285357.0572 2184863920.31107
The result I get in C is:
1398795243.74255 79184.33844201 -9594022229.12525 28322858223.2099 -19819644215.1338 -305583186.690388
79166.91917247 3402.48426033 52556628.829717 -126546466.939768 76466567.769084 -2564764.38775363
-9594334089.78616 52556515.9039231 1004461808180.58 -2455360323666.24 1502806633291.96 -42481639977.8112
28323609294.95 -126546129.049526 -2455359143404.21 6020330778543.35 -3694093433789.59 101211765648.895
-19820098170.0141 76466329.4304944 1502805309171.23 -3694091962863.6 2271455511686.72 -60603547743.7687
-305568392.855205 -2564768.40243798 -42481807759.1065 101212225714.588 -60603854784.616 2185698311.36118
The difference between the two (R - C) is:
-118562.671649933 415.6198192764 8247876.90765953 -20311027.5418015 12507618.5904007 -327309.978267968
424.5539327194 -1.3609798784 -27269.4164030999 67275.2745009959 -41490.291238904 1064.5449163299
8413314.25839043 -27227.552922301 -544970420.589966 1344206932.90039 -828869777.349854 21313146.4894028
-20709242.8262024 67140.0062440038 1343568061.89014 -3313879228.39941 2043353828.96973 -52563162.2870026
12748195.2462006 -41390.7198514938 -828153259.419922 2042558174.66016 -1259419586.19043 32408074.3294983
-335134.889266014 1067.29834637 21400799.0577011 -52804363.5690002 32569427.5587997 -834391.050109863
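To check whether the algorithm I use in C is at fault, I also computed the pseudoinverse in Python with numpy.linalg.pinv(), which uses singular value decomposition as well. The result differs from both C and R: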
1398224882.37767 81521.32618159 -9548319116.82994 28210636794.0452 -19750702778.4149 -307443670.558374
81576.67749763 3392.80756354 52367028.3401356 -126080750.377468 76180379.3995419 -2557069.77374461
-9547349936.09641 52367486.8455529 1000758728845.37 -2446264734953.02 1497217439225.67 -42331313003.6236
28208301799.8629 -126082060.163116 -2446268326785.52 5998001838415.43 -3680372478514.1 100842703532.378
-19749291055.22 76181277.4796568 1497221470187.79 -3680376958173.79 2263027785174.03 -60376849475.2803
-307489737.200422 -2557061.32729561 -42330783514.2789 100841257137.344 -60375886615.3659 2179570267.21681
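Edit: I made a mistake. The matrix I originally posted did not include all the digits needed to reproduce the results. I have updated the question with the correct matrix.
Answer 0 (score: 2)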
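The generalized inverse A^g of A should satisfy the four Moore-Penrose conditions:
A^g A A^g = A^g
A A^g A = A
(A A^g)^T = A A^g
(A^g A)^T = A^g A
For a symmetric A, as here, the Moore-Penrose inverse is itself symmetric and commutes with A, so the last two conditions can also be checked in the form (A A^g)^T = A^g A and (A^g A)^T = A A^g.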
For the given matrix, the result of corpcor::pseudoinverse does not satisfy these properties, while the result of MASS::ginv does:
check_pinv <- function(mat, fun, ...) {
  pinv <- fun(mat, ...)
  isTRUE(all.equal(mat %*% pinv %*% mat, mat)) &&
    isTRUE(all.equal(pinv %*% mat %*% pinv, pinv)) &&
    isTRUE(all.equal(pinv %*% mat, t(mat %*% pinv))) &&
    isTRUE(all.equal(mat %*% pinv, t(pinv %*% mat)))
}
mat <- matrix(c(
1, 0.920980394593472, 0.996160973582776, 0.996772980609752, 0.997372221594439, 0.999972797627027,
0.920980394593472, 1, 0.885601439824631, 0.88878682654952, 0.892173764646865, 0.923738536637407,
0.996160973582776, 0.885601439824631, 1, 0.999973383442349, 0.999885329646229, 0.99549326808266,
0.996772980609752, 0.88878682654952, 0.999973383442349, 1, 0.999969202115456, 0.996158288591094,
0.997372221594439, 0.892173764646865, 0.999885329646229, 0.999969202115456, 1, 0.996814694067663,
0.999972797627027, 0.923738536637407, 0.99549326808266, 0.996158288591094, 0.996814694067663, 1), nrow = 6, ncol = 6)
check_pinv(mat, corpcor::pseudoinverse)
#> [1] FALSE
check_pinv(mat, MASS::ginv)
#> [1] TRUE
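For readers who want the same kind of check outside R, a minimal Python sketch of the four standard Moore-Penrose conditions might look like the following. The name check_pinv mirrors the R helper above, but here it takes a candidate inverse rather than a function; that signature, the small 3x2 example matrix, and the use of numpy.allclose with its default tolerances (which are not the same as those of all.equal) are assumptions made for illustration only.

```python
import numpy as np

def check_pinv(a, a_g):
    """Return True if a_g satisfies the four Moore-Penrose conditions for a.

    Illustrative helper (not part of NumPy); relies on np.allclose defaults.
    """
    return (
        np.allclose(a_g @ a @ a_g, a_g)          # A^g A A^g = A^g
        and np.allclose(a @ a_g @ a, a)          # A A^g A = A
        and np.allclose((a @ a_g).T, a @ a_g)    # A A^g is symmetric
        and np.allclose((a_g @ a).T, a_g @ a)    # A^g A is symmetric
    )

# Small, well-conditioned example matrix (an assumption, not from the question).
a = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

ok_pinv = check_pinv(a, np.linalg.pinv(a))  # genuine pseudoinverse
ok_transpose = check_pinv(a, a.T)           # the transpose is not a pseudoinverse
print(ok_pinv, ok_transpose)
```

Whether such a check passes for an ill-conditioned matrix depends strongly on the comparison tolerances, so a failed check does not by itself say which implementation is "right".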
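An important difference between these two functions is the default tolerance used to decide whether a singular value should be treated as zero. If sqrt(.Machine$double.eps) (the value used by MASS::ginv) is also used for corpcor::pseudoinverse, the pseudoinverse properties are satisfied: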
check_pinv(mat, corpcor::pseudoinverse, max(svd(mat)$d) * sqrt(.Machine$double.eps))
#> [1] TRUE
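Note that max(svd(mat)$d) * sqrt(.Machine$double.eps) has to be used, because corpcor::pseudoinverse interprets the tolerance as an absolute value, whereas MASS::ginv treats it as relative to the largest singular value. With this tolerance, the two resulting pseudoinverse matrices are identical: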
all.equal(corpcor::pseudoinverse(mat, max(svd(mat)$d) * sqrt(.Machine$double.eps)),
MASS::ginv(mat))
#> [1] TRUE
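The distinction between an absolute and a relative tolerance can be illustrated with a small SVD-based pseudoinverse. This is only a sketch under stated assumptions: pinv_svd and its relative flag are hypothetical names introduced here, the diagonal test matrix is made up, and the code is not the implementation used by corpcor, MASS, or the C routine.

```python
import numpy as np

def pinv_svd(a, tol, relative=True):
    """Pseudoinverse via SVD; singular values at or below the cutoff become zero.

    Hypothetical helper: `relative=True` scales tol by the largest singular
    value, `relative=False` uses tol as an absolute cutoff.
    """
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    cutoff = tol * s.max() if relative else tol
    keep = s > cutoff
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    # V * diag(s_inv) * U^T (real matrices assumed)
    return (vt.T * s_inv) @ u.T

# Made-up matrix with singular values 100 and 1e-7.
a = np.diag([100.0, 1e-7])
tol = 1e-8

pinv_rel = pinv_svd(a, tol, relative=True)    # cutoff 1e-6: drops 1e-7
pinv_abs = pinv_svd(a, tol, relative=False)   # cutoff 1e-8: keeps 1e-7

# Scaling the absolute tolerance by the largest singular value
# reproduces the relative behaviour.
s_max = np.linalg.svd(a, compute_uv=False).max()
pinv_abs_scaled = pinv_svd(a, tol * s_max, relative=False)

print(pinv_rel)
print(pinv_abs)
```

With the same nominal tolerance the two interpretations keep a different number of singular values, which is enough to change the result by many orders of magnitude when a singular value sits between the two cutoffs.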
In Python, neither numpy.linalg.pinv nor scipy.linalg.pinv satisfies these properties:
import numpy
mat = numpy.array([[1, 0.9209803946, 0.9961609736, 0.9967729806, 0.9973722216, 0.9999727976],
[0.9209803946, 1, 0.8856014398, 0.8887868265, 0.8921737646, 0.9237385366],
[0.9961609736, 0.8856014398, 1, 0.9999733834, 0.9998853296, 0.9954932681],
[0.9967729806, 0.8887868265, 0.9999733834, 1, 0.9999692021, 0.9961582886],
[0.9973722216, 0.8921737646, 0.9998853296, 0.9999692021, 1, 0.9968146941],
[0.9999727976, 0.9237385366, 0.9954932681, 0.9961582886, 0.9968146941, 1]])
pinv1 = numpy.linalg.pinv(mat)
print(numpy.allclose(pinv1.dot(mat).dot(pinv1), pinv1))
# False
print(numpy.allclose(mat.dot(pinv1).dot(mat), mat))
# True
from scipy import linalg
pinv2 = linalg.pinv(mat)
print(numpy.allclose(pinv2.dot(mat).dot(pinv2), pinv2))
# False
print(numpy.allclose(mat.dot(pinv2).dot(mat), mat))
# False
print(numpy.allclose(pinv1, pinv2))
# True
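Note: the matrix here uses the values originally posted in the question. The results are not affected, because only the smallest singular value changes noticeably.
Again, the pseudoinverse properties are satisfied if 1e-8 is used as the tolerance instead of the default 1e-15. The same holds for the C version, which can be used from R together with RcppGSL.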
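As a rough illustration of the tolerance effect in NumPy, the sketch below computes the singular values of the matrix from the question and compares numpy.linalg.pinv with rcond=1e-15 and rcond=1e-8. The variable names are chosen here for illustration, and the exact numbers printed will depend on the NumPy and LAPACK versions in use.

```python
import numpy as np

# Matrix from the question (symmetric).
mat = np.array([
    [1, 0.920980394593472, 0.996160973582776, 0.996772980609752, 0.997372221594439, 0.999972797627027],
    [0.920980394593472, 1, 0.885601439824631, 0.88878682654952, 0.892173764646865, 0.923738536637407],
    [0.996160973582776, 0.885601439824631, 1, 0.999973383442349, 0.999885329646229, 0.99549326808266],
    [0.996772980609752, 0.88878682654952, 0.999973383442349, 1, 0.999969202115456, 0.996158288591094],
    [0.997372221594439, 0.892173764646865, 0.999885329646229, 0.999969202115456, 1, 0.996814694067663],
    [0.999972797627027, 0.923738536637407, 0.99549326808266, 0.996158288591094, 0.996814694067663, 1],
])

# Singular values show how close to singular the matrix is.
s = np.linalg.svd(mat, compute_uv=False)
print("singular values:", s)

for rcond in (1e-15, 1e-8):
    # numpy.linalg.pinv zeroes singular values <= rcond * largest singular value.
    cutoff = rcond * s.max()
    pinv = np.linalg.pinv(mat, rcond=rcond)
    print("rcond:", rcond,
          "singular values kept:", int((s > cutoff).sum()),
          "largest |entry| of pinv:", np.abs(pinv).max())

pinv_loose = np.linalg.pinv(mat, rcond=1e-8)
print(np.allclose(mat.dot(pinv_loose).dot(mat), mat))
print(np.allclose(pinv_loose.dot(mat).dot(pinv_loose), pinv_loose))
```

The looser tolerance discards the smallest singular values instead of inverting them, which is why the entries of the resulting pseudoinverse are so much smaller and less sensitive to rounding differences between implementations.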