我无法理解如何在SPSS中执行LOOCV。 我需要评估一个简单的线性回归 $ Y = AX + B $。 感谢。
答案 0 :(得分:1)
对于线性回归,它是pretty easy,SPSS允许您在REGRESSION
命令中保存统计信息。请参阅here for another example。
REGRESSION
/NOORIGIN
/DEPENDENT Y
/METHOD=ENTER X
/SAVE PRED (PredAll) DFIT (CVFit).
然后,留出一个预测可以计算为COMPUTE LeaveOneOut = PredAll - CVFit.
但是对于非线性模型,SPSS不能为一个人提供方便的SAVE
值,可以构建具有缺失值的重复数据集,然后使用SPLIT FILE
,然后获取您想要的任何统计程序的统计数据。如果您的id变量只是数据集的行号,您只需要两个最大案例编号的循环,然后将所需的信息与新文件匹配。
以下是此过程的示例。
*Making some fake data to work with.
INPUT PROGRAM.
LOOP Id = 1 TO 10.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
DATASET NAME Sim.
COMPUTE X = RV.NORMAL(10,5).
COMPUTE Y = 3 + 0.2*(X) + RV.NORMAL(0,0.2).
FORMATS Id (F2.0) X Y (F4.2).
EXECUTE.
*Original regression model with the leave one.
*out fits.
REGRESSION
/NOORIGIN
/DEPENDENT Y
/METHOD=ENTER X
/SAVE PRED (PredAll) DFIT (CVFit).
*Manual way to create stacked dataset
*can use with other non-linear models.
INPUT PROGRAM.
COMPUTE #Cases = 10.
LOOP #Id = 1 TO #Cases.
LOOP #Iter = 1 TO #Cases.
COMPUTE L1O = #Iter.
COMPUTE Id = #Id.
END CASE.
END LOOP.
END LOOP.
END FILE.
END INPUT PROGRAM.
DATASET NAME LeaveOneOut.
*Merging in original data.
MATCH FILES FILE = *
/TABLE = 'Sim'
/BY Id.
*Set missing to
IF L1O = Id Y = $SYSMIS.
SORT CASES BY L1O.
SPLIT FILE BY L1O.
*You can replace regression with whatever procedure you are.
*interested in.
REGRESSION
/NOORIGIN
/DEPENDENT Y
/METHOD=ENTER X
/SAVE PRED (CVFit2).
SPLIT FILE OFF.
*This shows the original leave one out stats.
*And new stats are the same besides some floating.
*point differences.
COMPUTE Test = (CVFit2 - (PredAll-CVFit)).
TEMPORARY.
SELECT IF (L1O = Id).
FREQ VAR Test.
EXECUTE.