假设我有两个PCA加载矩阵loa.orig
和loa.rot
,我知道loa.rot
是loa.orig
的轮换(手动或其他)。
(loa.orig
也可能已经通过varimax或其他东西进行正交旋转,但我认为不重要。)
我知道想知道loa.orig
已被轮换到loa.rot
的角度。
我从this comment on another question了解到“旋转不通勤”,因此,成对(平面)旋转的顺序也很重要。
所以从loa.rot
重现loa.orig
我需要知道一系列必要的轮换,最好按照下面rots
给出的顺序。
这是一个MWE:
library(psych) # this allows manual rotation
# suppose I have some ORIGINAL loadings matrix, from a principal components analysis, with three retained components
loa.orig <- cbind(c(0.6101496, 0.7114088, 0.3356003, 0.7318809, 0.5980133, 0.4102817, 0.7059148, 0.6080662, 0.5089014, 0.587025, 0.6166816, 0.6728603, 0.7482675, 0.5409658, 0.6415472, 0.3655053, 0.6313868), c(-0.205317, 0.3273207, 0.7551585, -0.1981179, -0.423377, -0.07281187, -0.04180098, 0.5003459, -0.504371, 0.1942334, -0.3285095, 0.5221494, 0.1850734, -0.2993066, -0.08715662, -0.02191772, -0.2002428), c(-0.4692407, 0.1581682, -0.04574932, -0.1189175, 0.2449018, -0.5283772, 0.02826476, 0.1703277, 0.2305158, 0.2135566, -0.2783354, -0.05187637, -0.104919, 0.5054129, -0.2403471, 0.5380329, -0.07999642))
# I then rotate 1-2 by 90°, and 1-3 by 45°
loa.rot <- factor.rotate(f = loa.orig, angle = 90, col1 = 1, col2 = 2)
loa.rot <- factor.rotate(f = loa.rot, angle = 45, col1 = 1, col2 = 3)
# predictably, loa.rot and loa.orig are now different
any(loa.rot == loa.orig) # are any of them the same?
显然,在这种情况下,我知道角度和顺序,但我们假设我没有。 另外,我们假设在实际用例中可能会保留和旋转许多组件,而不仅仅是三个。
我有点不确定报告组件对(平面)旋转角度的顺序的常规方法,但我想象的列表可能的组合(~~不是排列~~)应该这样做。
combs <- combn(x = ncol(loa.orig), m = 2, simplify = TRUE) # find all possible combinations of factors
rots <- data.frame(t(combs), stringsAsFactors = FALSE) # transpose
rots # these rows give the *order* in which the rotations should be done
rots
给出了这些排列。
很高兴知道如何从loa.rot
loa.orig
到达rots
,旋转varimax
中行所给出的组件对。
更新:根据以下答案尝试
根据以下答案,我尝试将一个函数组合在一起,并使用varimax
旋转和真实数据集对其进行测试。
(library(psych)
data("Harman74.cor") # notice the correlation matrix is called "cov", but doc says its a cor matrix
vanilla <- principal(r = Harman74.cor$cov, nfactors = 4, rotate = "none", )$loadings # this is unrotated
class(vanilla) <- NULL # print methods only causes confusion
varimax <- principal(r = Harman74.cor$cov, nfactors = 4, rotate = "varimax")$loadings # this is rotated
class(varimax) <- NULL # print methods only causes confusion
find.rot.instr <- function(original, rotated) {
# original <- vanilla$loadings # testing
# rotated <- varimax$loadings # testing
getAngle <- function(A, B) acos(sum(A*B) / (norm(A, "F") * norm(B, "F"))) * 180/pi
rots <- combn(x = ncol(original), m = 2, simplify = FALSE) # find all possible combinations of factor pairs
tmp <- original
angles <- sapply(rots, function(cols) {
angle <- getAngle(tmp[, cols], rotated[, cols])
tmp <<- factor.rotate(tmp, angle = angle, col1 = cols[1], col2 = cols[2])
return(angle)
})
return(angles)
}
vanilla.to.varimax.instr <- find.rot.instr(original = vanilla, rotated = varimax) # these are the angles we would need to transform in this order
rots <- combn(x = ncol(vanilla), m = 2, simplify = FALSE) # find all possible combinations of factor pairs
# this is again, because above is in function
# now let's implement the extracted "recipe"
varimax.recreated <- vanilla # start with original loadings
varimax.recreated == vanilla # confirm that it IS the same
for (i in 1:length(rots)) { # loop over all combinations, starting from the top
varimax.recreated <- factor.rotate(f = varimax.recreated, angle = vanilla.to.varimax.instr[i], col1 = rots[[i]][1], col2 = rots[[i]][2])
}
varimax == varimax.recreated # test whether they are the same
varimax - varimax.recreated # are the close?
没有特别的理由 - 我只是想要一些我们实际上不知道角度的旋转。)。
然后我测试是否可以使用提取的角度从香草载荷中重新创建varimax旋转。
> varimax == varimax.recreated # test whether they are the same
PC1 PC3 PC2 PC4
VisualPerception FALSE FALSE FALSE FALSE
Cubes FALSE FALSE FALSE FALSE
PaperFormBoard FALSE FALSE FALSE FALSE
Flags FALSE FALSE FALSE FALSE
GeneralInformation FALSE FALSE FALSE FALSE
PargraphComprehension FALSE FALSE FALSE FALSE
SentenceCompletion FALSE FALSE FALSE FALSE
WordClassification FALSE FALSE FALSE FALSE
WordMeaning FALSE FALSE FALSE FALSE
Addition FALSE FALSE FALSE FALSE
Code FALSE FALSE FALSE FALSE
CountingDots FALSE FALSE FALSE FALSE
StraightCurvedCapitals FALSE FALSE FALSE FALSE
WordRecognition FALSE FALSE FALSE FALSE
NumberRecognition FALSE FALSE FALSE FALSE
FigureRecognition FALSE FALSE FALSE FALSE
ObjectNumber FALSE FALSE FALSE FALSE
NumberFigure FALSE FALSE FALSE FALSE
FigureWord FALSE FALSE FALSE FALSE
Deduction FALSE FALSE FALSE FALSE
NumericalPuzzles FALSE FALSE FALSE FALSE
ProblemReasoning FALSE FALSE FALSE FALSE
SeriesCompletion FALSE FALSE FALSE FALSE
ArithmeticProblems FALSE FALSE FALSE FALSE
> varimax - varimax.recreated # are the close?
PC1 PC3 PC2 PC4
VisualPerception 0.2975463 1.06789735 0.467850675 0.7740766
Cubes 0.2317711 0.91086618 0.361004861 0.4366521
PaperFormBoard 0.1840995 0.98694002 0.369663215 0.5496151
Flags 0.4158185 0.82820078 0.439876777 0.5312143
GeneralInformation 0.8807097 -0.33385999 0.428455899 0.7537385
PargraphComprehension 0.7604679 -0.30162120 0.389727192 0.8329341
SentenceCompletion 0.9682664 -0.39302764 0.445263121 0.6673116
WordClassification 0.7714312 0.03747430 0.460461099 0.7643221
WordMeaning 0.8010876 -0.35125832 0.396077591 0.8201986
Addition 0.4236932 -0.32573100 0.204307400 0.6380764
Code 0.1654224 -0.01757153 0.194533996 0.9777764
CountingDots 0.3585004 0.28032822 0.301148474 0.5929926
StraightCurvedCapitals 0.5313385 0.55251701 0.452293566 0.6859854
WordRecognition -0.3157408 -0.13019630 -0.034647588 1.1235253
NumberRecognition -0.4221889 0.10729098 -0.035324356 1.0963785
FigureRecognition -0.3213392 0.76012989 0.158748259 1.1327322
ObjectNumber -0.3234966 -0.02363732 -0.007830001 1.1804147
NumberFigure -0.2033601 0.59238705 0.170467459 1.0831672
FigureWord -0.0788080 0.35303097 0.154132395 0.9097971
Deduction 0.3423495 0.41210812 0.363022937 0.9181519
NumericalPuzzles 0.3573858 0.57718626 0.393958036 0.8206304
ProblemReasoning 0.3430690 0.39082641 0.358095577 0.9133117
SeriesCompletion 0.4933886 0.56821932 0.465602192 0.9062039
ArithmeticProblems 0.4835965 -0.03474482 0.332889805 0.9364874
不幸的是,它们不一样,甚至不相似:(
string.IsNullOrEmpty(row.Cells[clm.Index].Value.ToString())
很明显,我犯了一个错误。
答案 0 :(得分:2)
我现在有一种方法可以找到旋转矩阵的任意数量维度的欧拉角的模拟(尽管随着尺寸的增加,它变得越来越计算密集)。此方法适用于样本数据集和varimax,适用于2到6个因子。我没有测试超过6.对于5和6因素,似乎第5列增加了反射 - 我不确定是什么决定哪些列被反射,因此这在目前的例子中是硬编码的。
该方法与以前一样,使用lm.fit
生成旋转矩阵。然后使用yacas
使用符号矩阵乘法计算复合旋转矩阵,以便在适当数量的维度中进行任意旋转。然后迭代地解决角度。矩阵中总有一个元素基于一个角度的sin
,然后可以用来迭代计算其他角度的值。
输出包括输入中实际不同的列子集,使用线性模型生成的复合旋转/反射矩阵,根据角度和列的单独旋转,计算的复合旋转矩阵,反射/列某些示例所需的交换矩阵,以及计算的旋转和输入之间差异的平方和(对于所有示例,这是1e-20
的顺序)。
与我原来的解决方案不同,这只提供了一种可能的旋转顺序组合。对于factorial(n * (n-1) / 2)
维度,实际可能性数量(即使仅相当于Tait-Bryan angles为n
,对于6维,约为1.3e12
。
library("psych")
library("magrittr")
library("stringr")
library("Ryacas")
rot_mat_nd <- function(dimensions, composite_var = NULL,
rot_order = combn(dimensions, 2, simplify = FALSE)) {
d <- diag(dimensions)
storage.mode(d) <- "character"
mats <- lapply(seq(rot_order), function(i) {
l <- paste0("a", i)
cmb <- rot_order[[i]]
d[cmb[1], cmb[1]] <- sprintf("cos(%s)", l)
d[cmb[1], cmb[2]] <- sprintf("-sin(%s)", l)
d[cmb[2], cmb[1]] <- sprintf("sin(%s)", l)
d[cmb[2], cmb[2]] <- sprintf("cos(%s)", l)
paste0("{{",
paste(apply(d, 1, paste, collapse = ","), collapse = "},{"),
"}}")
})
yac_statement <- paste0("Simplify(", paste(mats, collapse = "*"), ")")
if (!is.null(composite_var)) {
yac_statement <- paste0(composite_var, " := ", yac_statement)
}
output <- yacas(yac_statement)
list(mats = mats, composite = output, rot_order = rot_order)
}
find_angles_nd <- function(input, rotated, reflect = NULL) {
matched_cols <- sapply(1:ncol(input), function(i)
isTRUE(all.equal(input[, i], rotated[, i])))
dimensions <- sum(!matched_cols)
theor_rot <- rot_mat_nd(dimensions, "r")
rv <- yacas("rv := Concat @ r")
swap_mat <- matrix(0, dimensions, dimensions)
swap_mat[cbind(1:dimensions, match(colnames(input), colnames(rotated)))] <- 1
if (!is.null(reflect)) {
swap_mat[, reflect] <- -swap_mat[, reflect]
}
input_changed <- input[, !matched_cols]
rotated_changed <- rotated[, !matched_cols]
rotated_swapped <- rotated_changed %*% swap_mat
rot_mat <- lm.fit(input_changed, rotated_swapped)$coef
rot_mat_v <- c(t(rot_mat))
known_angles <- numeric()
angles_to_find <- nrow(rot_mat) * (nrow(rot_mat) - 1) / 2
iterations <- 0L
angles_found <- -1
while(length(known_angles) < angles_to_find & angles_found != 0) {
iterations <- iterations + 1L
message(sprintf("Iteration %d; angles remaining at start %d",
iterations, angles_to_find - length(known_angles)))
yacas(sprintf("rvwv := WithValue({%s}, {%s}, rv)",
paste(names(known_angles), collapse = ", "),
paste(known_angles, collapse = ", ")
))
var_num <- yacas("MapSingle(Length, MapSingle(VarList, rvwv))") %>%
as.expression %>%
eval %>%
unlist
angles_found <- 0L
for (i in which(var_num == 1)) {
var <- as.character(yacas(sprintf("VarList(rvwv[%d])[1]", i)))
if (!(var %in% names(known_angles))) {
to_solve <- as.character(yacas(sprintf("rvwv[%d]", i)))
fun_var <- str_extract(to_solve, sprintf("(sin|cos)\\(%s\\)", var))
fun_c <- substr(fun_var, 1, 3)
if (fun_c == "sin") {
to_solve_mod <- str_replace(to_solve, fixed(fun_var), "x")
solved <- as.character(yacas(sprintf("Solve(%s == %0.15f, x)[1]", to_solve_mod, rot_mat_v[i])))
answer <- asin(eval(parse(text = str_replace(solved, "x == ", ""))))
known_angles <- c(known_angles, setNames(answer, var))
angles_found <- angles_found + 1L
}
}
}
message(sprintf("- found %d", angles_found))
}
calc_rot_mat <-
matrix(unlist(simplify2array(eval(
as.expression(theor_rot$composite),
as.list(known_angles)
))), dimensions, byrow = TRUE)
ssd <- sum((input_changed %*% calc_rot_mat %*% swap_mat - rotated_changed) ^ 2)
angles <- known_angles[paste0("a", 1:angles_to_find)] / pi * 180
list(rot_mat = rot_mat, calc_rot_mat = calc_rot_mat, swap_mat = swap_mat, angles = angles,
rot_order = theor_rot$rot_order, input_changed = input_changed,
rotated_changed = rotated_changed, sum_square_diffs = ssd)
}
factor_rotate_multiple <- function(input_changed, angles, rot_order, swap_mat) {
rotated <- input_changed
for (i in 1:length(angles)) {
rotated <- factor.rotate(rotated, angles[i], rot_order[[i]][1], rot_order[[i]][2])
}
rotated %*% swap_mat
}
2-6维度的示例
data("Harman74.cor") # notice the correlation matrix is called "cov", but doc says its a cor matrix
example_nd <- function(dimensions, reflect = NULL) {
find_angles_nd(
unclass(principal(r = Harman74.cor$cov, nfactors = dimensions, rotate = "none")$loadings),
unclass(principal(r = Harman74.cor$cov, nfactors = dimensions, rotate = "varimax")$loadings),
reflect = reflect
)
}
angles_2d <- example_nd(2)
angles_2d[c("angles", "rot_order", "sum_square_diffs")]
# shows the resultant angle in degrees, rotation order and the
# sum of the squares of the differences between the provided and calculated
# rotations
#$angles
# a1
#-39.88672
#
#$rot_order
#$rot_order[[1]]
#[1] 1 2
#
#
#$sum_square_diffs
#[1] 8.704914e-20
angles_3d <- example_nd(2)
angles_3d[c("angles", "rot_order", "sum_square_diffs")]
#$angles
# a1 a2 a3
#-45.19881 -29.77423 -17.07210
#
#$rot_order
#$rot_order[[1]]
#[1] 1 2
#
#$rot_order[[2]]
#[1] 1 3
#
#$rot_order[[3]]
#[1] 2 3
#
#
#$sum_square_diffs
#[1] 7.498253e-20
angles_4d <- example_nd(2)
angles_5d <- example_nd(2, reflect = 5)
angles_6d <- example_nd(2, reflect = 5)
这个问题可以分为两个。第一部分是计算将输入与输出相关联的复合旋转矩阵。即在等式A %*% B == C
中,从B
和A
计算C
。对于方形矩阵,可以使用solve
来完成。然而,在这种情况下,行&gt;列最简单的方法是使用线性模型和lm.fit
函数。
问题的第二部分是识别为了产生复合旋转矩阵而执行的旋转。存在无限多种可能的组合,但是通常,就围绕三个轴的一系列旋转(即使用欧拉角)工作(对于3列)是合理的。即使这样,轮换也有六种可能的顺序。对于两列,问题是微不足道的,因为只需要一次旋转,并且单个asin
或acos
足以识别角度。对于超过3列,可能generalise the Euler angle method但更复杂。
这是R中使用线性模型找到复合旋转矩阵和RSpincalc
包以找到欧拉角的完整方法。它假设旋转影响了3列,如给出的例子。
library("RSpincalc")
library("combinat")
find_rotations <- function(input, rotated) {
matched_cols <- sapply(1:ncol(input), function(i) isTRUE(all.equal(input[, i], rotated[, i])))
if (sum(!matched_cols) != 3) {
stop("This method only works for rotations affecting 3 columns.")
}
rot_mat <- lm.fit(input[, !matched_cols], rotated[, !matched_cols])$coef
rot_poss <- as.data.frame(do.call("rbind", permn(c("z", "y", "x"))), stringsAsFactors = FALSE)
rot_poss$axes_for_EA <- apply(rot_poss, 1, function(x) paste(rev(x), collapse = ""))
combo_cols <- as.data.frame(matrix(which(!matched_cols)[combn(3, 2)], 2))
rot_poss[5:10] <- do.call("rbind", permn(combo_cols, unlist))
names(rot_poss)[c(1:3, 5:10)] <- c(paste0("axis", 1:3),
paste0("rot", rep(1:3, each = 2), "_c", rep(1:2, 3)))
rot_poss[paste0("angle", 1:3)] <- t(round(sapply(rot_poss$axes, DCM2EA, DCM = rot_mat), 14)) * 180 / pi
rot_poss[paste0("angle", 1:3)] <-
lapply(1:3, function(i) {
ifelse(rot_poss[, paste0("axis", i)] == "y", 1, -1) *
rot_poss[, paste0("angle", 4 - i)]
})
rot_poss
}
对于OP的数据:
rot_poss <- find_rotations(loa.orig, loa.rot)
另一个更全面的演示:
set.seed(123)
input <- matrix(rnorm(65), 13)
library("magrittr")
rotated <- input %>%
factor.rotate(33.5, 1, 3) %>%
factor.rotate(63, 2, 3) %>%
factor.rotate(-3, 1, 2)
rot_poss <- find_rotations(input, rotated)
rot_poss
# axis1 axis2 axis3 axes_for_EA rot1_c1 rot1_c2 rot2_c1 rot2_c2 rot3_c1 rot3_c2
#1 z y x xyz 1 2 1 3 2 3
#2 z x y yxz 1 2 2 3 1 3
#3 x z y yzx 2 3 1 2 1 3
#4 x y z zyx 2 3 1 3 1 2
#5 y x z zxy 1 3 2 3 1 2
#6 y z x xzy 1 3 1 2 2 3
# angle1 angle2 angle3
#1 -1.585361 30.816825 63.84410
#2 44.624426 50.431683 53.53631
#3 59.538985 26.581047 16.27152
#4 66.980038 14.511490 27.52974
#5 33.500000 63.000000 -3.00000
#6 30.826477 -1.361477 63.03177
possible_calls <- do.call("sprintf", c(list(fmt = "input %%>%%
factor.rotate(%1$g, %4$g, %5$g) %%>%%
factor.rotate(%2$g, %6$g, %7$g) %%>%%
factor.rotate(%3$g, %8$g, %9$g)"),
rot_poss[c(paste0("angle", 1:3),
grep("^rot", names(rot_poss), value = TRUE))]))
cat(possible_calls, sep = "\n\n")
#input %>%
#factor.rotate(-1.58536, 1, 2) %>%
#factor.rotate(30.8168, 1, 3) %>%
#factor.rotate(63.8441, 2, 3)
#input %>%
#factor.rotate(44.6244, 1, 2) %>%
#factor.rotate(50.4317, 2, 3) %>%
#factor.rotate(53.5363, 1, 3)
#input %>%
#factor.rotate(59.539, 2, 3) %>%
#factor.rotate(26.581, 1, 2) %>%
#factor.rotate(16.2715, 1, 3)
#input %>%
#factor.rotate(66.98, 2, 3) %>%
#factor.rotate(14.5115, 1, 3) %>%
#factor.rotate(27.5297, 1, 2)
#input %>%
#factor.rotate(33.5, 1, 3) %>%
#factor.rotate(63, 2, 3) %>%
#factor.rotate(-3, 1, 2)
#input %>%
#factor.rotate(30.8265, 1, 3) %>%
#factor.rotate(-1.36148, 1, 2) %>%
#factor.rotate(63.0318, 2, 3)
lapply(possible_calls, function(cl) all.equal(eval(parse(text = cl)), rotated, tolerance = 1e-6))
# tolerance reduced because above calls have rounding to 6 significant figures
注意最后一位将使用可以重现旋转的magrittr
管道输出调用。另请注意,每个角度有6种可能的旋转顺序。
为了使角度匹配,我不得不在y旋转时翻转标志。我也不得不翻转轮换顺序。
对于varimax
更新,我的方法适用于nfactor = 3
,但需要调整为4或更高。
答案 1 :(得分:1)
我认为如果你进行矩阵代数问题会更容易(我正在做正交旋转,而不是倾斜变换,但逻辑是一样的。)
首先,注意任何旋转t都会产生一个旋转的分量矩阵c,使c = Ct,其中C是未旋转的PCA解。
然后,由于C'C是原始相关矩阵R,我们可以通过将两边乘以C'然后乘以R的倒数来求解t。这导致
t = C'R ^ -1 c
对于您的示例,请
R <- Harman74.cor$cov
P <- principal(Harman74.cor$cov,nfactors=4,rotate="none")
p <- principal(Harman74.cor$cov,nfactors=4) #the default is to do varimax
rotation <- t(P$loadings) %*% solve(R) %*% p$loadings
然后,看看这是否正确
P$loadings %*% rotation - p$loadings
此外,即将发布的psych现在会报告旋转矩阵,因此我们可以将旋转与p $ rot.mat进行比较
round(p$rot.mat,2)
round(rotation,2)
这些组件的顺序不同,但这是因为心理在旋转后对组件进行重新排序以反映旋转组件的平方和的大小。