为什么在" Varimax"之间存在psych :: principal之间的差异?和" varimax"?

Time: 2015-09-02 11:21:35

Tags: r matrix pca psych

In a related question, I asked why there are differences between stats::varimax and GPArotation::Varimax, both of which are called by psych::principal depending on the option set for rotate =.

The differences between those two (see the other question) explain some, but not all, of the differences in psych::principal. It appears as if these differences are somehow exacerbated by psych::principal. (I have a simple theory why, which I would like to have confirmed.)

library(GPArotation)
library(psych)
data("Thurstone")

principal.unrotated <- principal(r = Thurstone, nfactors = 4, rotate = "none")  # find unrotated PCs first
loa <- unclass(principal.unrotated$loadings)

# just to compare that the rot.mat is correct
varimax.stats <- stats::varimax(x = loa, normalize = TRUE)
varimax.GPA <- GPArotation::Varimax(L = loa, normalize = TRUE)
# notice we're here NOT interested in the difference between stats and GPA, that's the other question:
# https://stackoverflow.com/questions/32350891/why-are-there-differences-between-gparotationvarimax-and-statsvarimax
diff.from.rot.meth <- unclass(varimax.stats$loadings - varimax.GPA$loadings)  # very small differences
mean(abs(diff.from.rot.meth))
#> [1] 8.036863e-05

principal.varimax.stats <- principal(r = Thurstone, nfactors = 4, rotate = "varimax")
principal.Varimax.GPA <- principal(r = Thurstone, nfactors = 4, rotate = "Varimax")
diff.from.princ <- principal.Varimax.GPA$rot.mat - principal.varimax.stats$rot.mat  # quite a substantial change, because Theta is NOT a rotmat, that makes sense
mean(abs(diff.from.princ))
#> [1] 0.021233
mean(abs(diff.from.rot.meth)) - mean(abs(diff.from.princ))  # principal has MUCH bigger differences
#> [1] -0.02115263

This seems too big to be a floating-point artifact.

My hypothesis is that the (additional) differences stem from the fact that GPArotation::Varimax defaults to normalize == FALSE (no Kaiser normalization), whereas stats::varimax defaults to normalize == TRUE, and that this cannot be set differently from within psych::principal.
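A quick way to see the mismatched defaults directly (a minimal check added here for illustration, assuming both packages are attached as above and reflecting the package versions current at the time of writing):

formals(stats::varimax)$normalize        # TRUE  -- Kaiser normalization on by default
formals(GPArotation::Varimax)$normalize  # FALSE -- no normalization by default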

From the stats::varimax manual:

  ## varimax with normalize = TRUE is the default

From the GPArotation::Varimax manual:

  The argument normalize gives an indication of if and how any normalization should be done before rotation, and then undone after rotation. If normalize is FALSE (the default) no normalization is done. If normalize is TRUE then Kaiser normalization is done. (So squared row entries of normalized A sum to 1.0. This is sometimes called Horst normalization.)
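To make the quoted definition concrete, here is a minimal sketch (an illustration added here, not taken from the manuals) of doing the Kaiser/Horst normalization by hand around an unnormalized GPArotation::Varimax call, using loa from the code above:

h <- sqrt(rowSums(loa^2))                # row norms of the unrotated loadings
loa.norm <- loa / h                      # squared row entries now sum to 1.0 (Kaiser/Horst normalization)
rot <- GPArotation::Varimax(L = loa.norm, normalize = FALSE)  # rotate the normalized loadings
loa.rot <- unclass(rot$loadings) * h     # undo the normalization after rotation
# loa.rot should closely match GPArotation::Varimax(L = loa, normalize = TRUE)$loadings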

Additionally, they warn in the GPArotation::GPForth manual:

  The GPArotation package does not (by default) normalize, neither does the fa function. Then, to make it more confusing, varimax in stats does, and Varimax in GPArotation does not.

Can anyone confirm that the differences are in fact explained by the normalization option?

2 answers:

Answer 0 (score: 1)

The problem of the different loadings arises because of the different accuracy of the procedures (and, of course, because the normalize option is not evaluated in psych::principal, all other procedures have to be used with that option switched to TRUE). While the accuracy of stats::varimax and GPArotation::Varimax can be configured (parameter eps), this is ignored in psych::principal and seems to be implicitly fixed at the stats::varimax default, eps = 1e-5.
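As an illustration of that point (an addition here, not part of the original answer), one can list the formal arguments of principal(); eps is not among them, so an eps= passed in the calls below is apparently absorbed by ... without reaching the rotation:

names(formals(psych::principal))
grep("eps", names(formals(psych::principal)), value = TRUE)  # expected: character(0)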

If we increase the accuracy of stats::varimax and GPArotation::Varimax to eps=1e-15, we get identical results (to at least 8 digits), which also agree with my own implementation in the MatMate program; that implementation has been tested and agrees very closely with an SPSS computation as well.

The missing handling of an explicit eps option in psych::principal seems to be a bug, and its poor implicit default is certainly unsatisfactory.

Interestingly, GPArotation::Varimax needs a great many rotations with eps=1e-15 (look at the output at the end); so either a different internal procedure is implemented, or the eps parameter is evaluated differently in the decision of when the iteration can be stopped. The example at the end of this answer suggests as much.
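One way to inspect that iteration behaviour directly (a sketch added here, not part of the original answer) is to look at GPForth's per-iteration table and convergence flag on the returned object:

fit.hi <- GPArotation::Varimax(L = loa, normalize = TRUE, eps = 1e-10, maxit = 1000)
nrow(fit.hi$Table)    # one row per iteration; hits maxit here, consistent with the warning in the protocol below
fit.hi$convergence    # FALSE if the eps criterion was never reached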

Below is the comparison protocol; only the first two rows of the loadings are shown.

The first three computations with low accuracy (eps=1e-5) 
all procedures give equal or comparable results, 
except GPArotation, which is already near the exact answer
--------------------------------
>  principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings    *1000000
>  stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings            *1000000
>  GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings      *1000000

The second three computations with (attempted) high accuracy (eps=1e-15) 
all procedures except psych::principal give equal results, and they
agree also with external crosscheck using MatMate
--------------------------------
>  principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings*1000000
>  stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings*1000000
>  GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings*1000000

First, the attempt with a large eps, or none given (i.e. the low default accuracy):
# ===== Numerical documentation (only first two rows are displayed)================
> principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings*1000000
                RC1       RC2       RC3       RC4      
Sentences       871655.72 216638.46 198427.07 175202.57
Vocabulary      855609.28 294166.99 153181.45 180525.99


> stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings         *1000000
                PC1       PC2       PC3       PC4      
Sentences       871655.72 216638.46 198427.07 175202.57
Vocabulary      855609.28 294166.99 153181.45 180525.99

> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings   *1000000
                     PC1      PC2      PC3       PC4
Sentences       871717.3 216618.7 198176.3 175204.47
Vocabulary      855663.1 294146.3 152930.7 180517.21
# =============================================================================

Now the attempt with a smaller eps, to get more accurate results:
> principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings  *1000000
                RC1       RC2       RC3       RC4      
Sentences       871655.72 216638.46 198427.07 175202.57
Vocabulary      855609.28 294166.99 153181.45 180525.99

> stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings                   *1000000
                PC1       PC2       PC3       PC4      
Sentences       871716.83 216619.69 198174.31 175207.86
Vocabulary      855662.58 294147.47 152928.77 180519.37

> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings             *1000000
                     PC1      PC2       PC3       PC4
Sentences       871716.8 216619.7 198174.31 175207.86
Vocabulary      855662.6 294147.5 152928.77 180519.37

Warnmeldung:
In GPForth(L, Tmat = Tmat, method = "varimax", normalize = normalize,  :
  convergence not obtained in GPForth. 1000 iterations used.

# Result by MatMate: --------------------------------------------------------
 lad = cholesky(Thurstone) 
 pc = rot(lad,"pca")
 pc4 = pc[*,1..4]                           // arrive at the first four pc's
     t = gettrans( normzl(pc4),"varimax")   // get rotation-matrix for row-normalized pc's
 vmx = pc4 * t                              // rotate pc4 by rotation-matrix 
 display = vmx     * 1000000
                     PC1      PC2       PC3       PC4
Sentences       871716.83    216619.68   198174.31   175207.87
Vocabulary      855662.58    294147.46   152928.77   180519.37


# ===============================================================================

The results of stats::varimax and GPArotation::Varimax can be matched even better by setting eps to 1e-12 in stats::varimax and to 1e-6 in GPArotation::Varimax (one value being the square of the other). We get:

> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-6)$loadings*1000000
                     PC1      PC2       PC3       PC4
Sentences       871716.8 216619.8 198174.49 175207.63
Vocabulary      855662.5 294147.6 152928.94 180519.21

> stats::varimax(x = loa, normalize = TRUE,eps=1e-12)$loadings*1000000
                PC1       PC2       PC3       PC4      
Sentences       871716.80 216619.74 198174.40 175207.85
Vocabulary      855662.55 294147.52 152928.86 180519.36

Answer 1 (score: 0)

This seems to confirm it: applying psych::kaiser (which, I believe, was built for exactly this purpose) shrinks the difference back down to the original difference between stats::varimax and GPArotation::Varimax:

principal.Varimax.GPA.kaiser <- kaiser(f = principal.unrotated, rotate = "Varimax")
diff.statsvari.gpavar.bothkaiser <- unclass(principal.Varimax.GPA.kaiser$loadings - principal.varimax.stats$loadings)
mean(abs(diff.statsvari.gpavar.bothkaiser))
#> [1] 8.036863e-05

This is pretty much the same result, so I think the hypothesis is confirmed:

the larger differences produced via psych::principal arise because of the differing defaults for normalize.

Update

The differences for the respective rotation matrices (or whatever Th is) are also much smaller now (again):

principal.Varimax.GPA.kaiser$Th - principal.varimax.stats$rot.mat  # those differences are very small now, too
#>               [,1]         [,2]          [,3]          [,4]
#> [1,]  1.380279e-04 1.380042e-05 -2.214319e-04 -2.279170e-06
#> [2,]  9.631517e-05 2.391296e-05  1.531723e-04 -3.371868e-05
#> [3,]  1.758299e-04 7.917460e-05  6.788867e-05  1.099072e-04
#> [4,] -9.548010e-05 6.500162e-05 -1.679753e-05 -5.213475e-05
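For what it's worth (a small sanity check added here, not part of the original answer): both Th and rot.mat should be numerically orthogonal rotation matrices, so their crossproducts should be close to the identity:

round(crossprod(principal.Varimax.GPA.kaiser$Th), 6)   # expected to be near diag(4)
round(crossprod(principal.varimax.stats$rot.mat), 6)   # likewise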