In a related question, I have asked why there are differences between stats::varimax and GPArotation::Varimax, both of which psych::principal calls depending on the option set for rotate =.
The difference between those two (see the other question) explains some, but not all, of the differences: they appear to be somehow exacerbated by psych::principal. (I have a simple theory why, which I would like to see confirmed.) The remaining discrepancy seems too large to be a floating-point artifact.
My hypothesis is that the (additional) differences stem from the differing normalization defaults, detailed below.
library(GPArotation)
library(psych)
data("Thurstone")
principal.unrotated <- principal(r = Thurstone, nfactors = 4, rotate = "none") # find unrotated PCs first
loa <- unclass(principal.unrotated$loadings)
# just to compare that the rot.mat is correct
varimax.stats <- stats::varimax(x = loa, normalize = TRUE)
varimax.GPA <- GPArotation::Varimax(L = loa, normalize = TRUE)
# notice we're here NOT interested in the difference between stats and GPA, that's the other question
diff.from.rot.meth <- unclass(varimax.stats$loadings - varimax.GPA$loadings) # very small differences, see this question: https://stackoverflow.com/questions/32350891/why-are-there-differences-between-gparotationvarimax-and-statsvarimax
mean(abs(diff.from.rot.meth))
#> [1] 8.036863e-05
principal.varimax.stats <- principal(r = Thurstone, nfactors = 4, rotate = "varimax")
principal.Varimax.GPA <- principal(r = Thurstone, nfactors = 4, rotate = "Varimax")
diff.from.princ <- principal.Varimax.GPA$rot.mat - principal.varimax.stats$rot.mat # quite a substantial change, because Theta is NOT a rotmat, that makes sense
mean(abs(diff.from.princ))
#> [1] 0.021233
mean(abs(diff.from.rot.meth)) - mean(abs(diff.from.princ)) # principal has MUCH bigger differences
#> [1] -0.02115263
GPArotation::Varimax defaults to normalize == FALSE (no Kaiser normalization), while stats::varimax defaults to normalize == TRUE (Kaiser normalization), and this cannot be set differently from within psych::principal.
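The differing defaults can be checked directly from the function signatures; a quick sketch, assuming both packages are installed:

```r
library(GPArotation)

# Inspect the default value of `normalize` in each rotation function
formals(stats::varimax)$normalize        # TRUE  -> Kaiser normalization on
formals(GPArotation::Varimax)$normalize  # FALSE -> no normalization
```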
The stats::varimax manual:
## varimax with normalize = TRUE is the default
The GPArotation::Varimax manual:
The argument normalize gives an indication of if and how any normalization should be done before rotation, and then undone after rotation. If normalize is FALSE (the default), no normalization is done. If normalize is TRUE, Kaiser normalization is done. (So the squared row entries of the normalized loadings matrix A sum to 1.0. This is sometimes called Horst normalization.)
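As a minimal sketch of what the quoted passage describes (the helper name kaiser_normalize is made up for illustration, not part of either package): each row of the loadings matrix is divided by its root sum of squares before rotation, so the squared row entries sum to 1.0, and the scaling is undone afterwards.

```r
# Hypothetical helper illustrating Kaiser (row) normalization
kaiser_normalize <- function(A) {
  h <- sqrt(rowSums(A^2))   # root of each row's sum of squares
  list(An = A / h, h = h)   # rows of An now have unit sum of squares
}
# After rotating the normalized loadings An, multiply each row by h
# again to undo the normalization.
```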
On top of that, the GPArotation::GPForth manual warns:
The GPArotation package does not normalize (by default), and neither does the fa function. Then, to make it more confusing, varimax in stats does, while Varimax in GPArotation does not.
Can anyone confirm that the differences are, in fact, explained by the normalization option?
Answer 0 (score: 1)
The problem of the differing loadings arises from the different accuracies of the procedures (and, of course, because the normalize option is not evaluated in psych::principal, all other procedures must be called with that option switched to TRUE). While the accuracy of stats::varimax and GPArotation::Varimax can be configured (parameter eps), this is ignored in psych::principal and seems implicitly equal to stats::varimax's default, eps = 1e-5.
If we increase the accuracy of stats::varimax and GPArotation::Varimax to eps = 1e-15, we get identical results (to at least 8 digits), which also agree with my own implementation in the MatMate program; that implementation in turn matches SPSS computations to high accuracy.
The missing handling of an explicit eps option in psych::principal looks like a bug, and its poor implicit default is certainly unsatisfactory.
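As a possible workaround (a sketch, not a patched psych API): skip the rotation step inside psych::principal and rotate the unrotated loadings yourself, where both normalize and eps can be set explicitly.

```r
library(psych)
data("Thurstone")

# Get unrotated principal components, then rotate manually with full control
pc  <- principal(r = Thurstone, nfactors = 4, rotate = "none")
loa <- unclass(pc$loadings)
rot <- stats::varimax(x = loa, normalize = TRUE, eps = 1e-15)
loadings.hi <- unclass(rot$loadings)  # high-accuracy varimax loadings
```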
Interestingly, GPArotation::Varimax needs a great many rotations with eps = 1e-15 (see the output at the end), so either it implements a different internal procedure, or it evaluates the eps parameter differently when deciding whether the iteration can stop. The example at the end of this answer may suggest as much.
Below is a protocol of the comparisons; only the first two rows of the loadings are shown.
The first three computations with low accuracy (eps=1e-5)
all procedures give equal or comparable results,
except GPArotation, which is already near the exact answer
--------------------------------
> principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings *1000000
> stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
The second three computations with (attempted) high accuracy (eps=1e-15)
all procedures except psych::principal give equal results, and they
agree also with external crosscheck using MatMate
--------------------------------
> principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings*1000000
> stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings*1000000
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings*1000000
First, the attempt with a large eps, or none given:
# ===== Numerical documentation (only first two rows are displayed)================
> principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings*1000000
RC1 RC2 RC3 RC4
Sentences 871655.72 216638.46 198427.07 175202.57
Vocabulary 855609.28 294166.99 153181.45 180525.99
> stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871655.72 216638.46 198427.07 175202.57
Vocabulary 855609.28 294166.99 153181.45 180525.99
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871717.3 216618.7 198176.3 175204.47
Vocabulary 855663.1 294146.3 152930.7 180517.21
# =============================================================================
Now the attempt with a smaller eps:
> principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings *1000000
RC1 RC2 RC3 RC4
Sentences 871655.72 216638.46 198427.07 175202.57
Vocabulary 855609.28 294166.99 153181.45 180525.99
> stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871716.83 216619.69 198174.31 175207.86
Vocabulary 855662.58 294147.47 152928.77 180519.37
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871716.8 216619.7 198174.31 175207.86
Vocabulary 855662.6 294147.5 152928.77 180519.37
Warnmeldung:
In GPForth(L, Tmat = Tmat, method = "varimax", normalize = normalize, :
convergence not obtained in GPForth. 1000 iterations used.
# Result by MatMate: --------------------------------------------------------
lad = cholesky(Thurstone)
pc = rot(lad,"pca")
pc4 = pc[*,1..4] // arrive at the first four pc's
t = gettrans( normzl(pc4),"varimax") // get rotation-matrix for row-normalized pc's
vmx = pc4 * t // rotate pc4 by rotation-matrix
display = vmx * 1000000
PC1 PC2 PC3 PC4
Sentences 871716.83 216619.68 198174.31 175207.87
Vocabulary 855662.58 294147.46 152928.77 180519.37
# ===============================================================================
The results of stats::varimax and GPArotation::Varimax can be made to match even better by setting eps = 1e-12 in stats::varimax and eps = 1e-6 in GPArotation::Varimax, where one value is the square of the other. We get:
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-6)$loadings*1000000
PC1 PC2 PC3 PC4
Sentences 871716.8 216619.8 198174.49 175207.63
Vocabulary 855662.5 294147.6 152928.94 180519.21
> stats::varimax(x = loa, normalize = TRUE,eps=1e-12)$loadings*1000000
PC1 PC2 PC3 PC4
Sentences 871716.80 216619.74 198174.40 175207.85
Vocabulary 855662.55 294147.52 152928.86 180519.36
Answer 1 (score: 0)
This seems to confirm it: applying psych::kaiser (which, I believe, is built for exactly this purpose) shrinks the difference back down to the original difference between stats::varimax and GPArotation::Varimax:
principal.Varimax.GPA.kaiser <- kaiser(f = principal.unrotated, rotate = "Varimax")
diff.statsvari.gpavar.bothkaiser <- unclass(principal.Varimax.GPA.kaiser$loadings - principal.varimax.stats$loadings)
mean(abs(diff.statsvari.gpavar.bothkaiser))
#> [1] 8.036863e-05
This is pretty much the same result, so I consider the hypothesis confirmed: the larger differences produced by psych::principal arise from the differing defaults for normalize.
Update
The differences for the corresponding rotation matrices (or Th, respectively) are now also much smaller again:
principal.Varimax.GPA.kaiser$Th - principal.varimax.stats$rot.mat # those differences are very small now, too
#> [,1] [,2] [,3] [,4]
#> [1,] 1.380279e-04 1.380042e-05 -2.214319e-04 -2.279170e-06
#> [2,] 9.631517e-05 2.391296e-05 1.531723e-04 -3.371868e-05
#> [3,] 1.758299e-04 7.917460e-05 6.788867e-05 1.099072e-04
#> [4,] -9.548010e-05 6.500162e-05 -1.679753e-05 -5.213475e-05