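My question is similar to "R: using predict() on new data with high dimensionality", but for Stata.
I want to run a principal component analysis (pca) on one subset of my data (the control group of an experiment) to extract the first component. I then want to apply that same PCA model to a separate subset (the treatment group of the experiment) and obtain scores for those observations. Essentially, I want to use the pca model fit on dataset_1 to predict scores in a new dataset_2.
In R, you fit the model on the control group only, then call predict on the fitted model with the full data passed as the newdata argument. That produces predictions for all observations from the model that was fit on the control group alone. How can I do this in Stata?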
global xlist2a std_agreedisagree1_1_a std_revagreedisagree1_2_a std_revagreedisagree1_3_a std_agreedisagree1_4_a std_revagreedisagree1_10_a std_revagreedisagree1_5_a
pca $xlist2a
screeplot, yline(1)
rotate, clear
pca $xlist2a, com(3)
rotate, varimax blanks(.30)
predict pca5_p1b pca5_p2b pca5_p3b, score
Fixed code, based on Nick's answer:
global xlist2a std_agreedisagree1_1_a std_revagreedisagree1_2_a std_revagreedisagree1_3_a std_agreedisagree1_4_a std_revagreedisagree1_10_a std_revagreedisagree1_5_a
pca $xlist2a if zgroupa10==1
screeplot, yline(1)
rotate, clear
pca $xlist2a if zgroupa10==1, com(3)
rotate, varimax blanks(.30)
predict pca5_p1b pca5_p2b pca5_p3b, score
Answer 0 (score: 0)
What code did you try? The simplest experiment shows that the same approach works in Stata too:
. sysuse auto, clear
(1978 Automobile Data)
. pca headroom trunk length displacement if foreign
Principal components/correlation                 Number of obs    =         22
                                                 Number of comp.  =          4
                                                 Trace            =          4
    Rotation: (unrotated = principal)            Rho              =     1.0000
--------------------------------------------------------------------------
   Component |   Eigenvalue   Difference         Proportion   Cumulative
-------------+------------------------------------------------------------
       Comp1 |      1.93666      .656823             0.4842       0.4842
       Comp2 |      1.27983      .615381             0.3200       0.8041
       Comp3 |      .664453      .545396             0.1661       0.9702
       Comp4 |      .119057            .             0.0298       1.0000
--------------------------------------------------------------------------
Principal components (eigenvectors)
--------------------------------------------------------------------
    Variable |    Comp1     Comp2     Comp3     Comp4 | Unexplained
-------------+----------------------------------------+-------------
    headroom |   0.0288    0.7373    0.6749    0.0083 |           0
       trunk |   0.2443    0.6496   -0.7199   -0.0090 |           0
      length |   0.6849   -0.1313    0.1229   -0.7061 |           0
displacement |   0.6858   -0.1313    0.1054    0.7080 |           0
--------------------------------------------------------------------
. predict score1 score2 if !foreign
(score assumed)
(2 components skipped)
Scoring coefficients
    sum of squares(column-loading) = 1
------------------------------------------------------
    Variable |    Comp1     Comp2     Comp3     Comp4
-------------+----------------------------------------
    headroom |   0.0288    0.7373    0.6749    0.0083
       trunk |   0.2443    0.6496   -0.7199   -0.0090
      length |   0.6849   -0.1313    0.1229   -0.7061
displacement |   0.6858   -0.1313    0.1054    0.7080
------------------------------------------------------
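To confirm that scores really are produced for observations outside the estimation sample, a minimal sketch along the same lines might look like this. It assumes the auto dataset shipped with Stata; the names insample, sc1, and sc2 are made up for illustration and are not part of the original example.

```stata
sysuse auto, clear

* Fit the PCA on foreign cars only (standing in for the control group)
pca headroom trunk length displacement if foreign, components(2)

* Flag the estimation sample (hypothetical variable name)
generate byte insample = e(sample)

* No if qualifier here, so scores are requested for every observation,
* using the coefficients estimated on the foreign cars alone
predict sc1 sc2, score

* Scores should be nonmissing outside the estimation sample as well
count if !missing(sc1) & !insample
assert r(N) > 0

* Compare the first-component scores across the two groups
tabstat sc1, by(foreign) statistics(n mean sd)
```

Dropping the if qualifier from predict is the closest analogue of passing the full data as newdata in R: the model is fit on one group and then scored on everyone.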