I am trying to obtain a column matrix of predictions in MATLAB, but I am not sure how to code it. My current code is:
load DataWorkspace.mat
groups = ismember(Num,'Yes');
k=10;
%# number of cross-validation folds:
%# If you have 50 samples, divide them into 10 groups of 5 samples each,
%# then train with 9 groups (45 samples) and test with 1 group (5 samples).
%# This is repeated ten times, with each group used exactly once as a test set.
%# Finally the 10 results from the folds are averaged to produce a single
%# performance estimation.
cvFolds = crossvalind('Kfold', groups, k);
cp = classperf(groups);
for i = 1:k
    testIdx = (cvFolds == i);
    trainIdx = ~testIdx;
    svmModel = svmtrain(Data(trainIdx,:), groups(trainIdx), ...
        'Autoscale',true, 'Showplot',false, 'Method','SMO', ...
        'Kernel_Function','rbf');
    pred = svmclassify(svmModel, Data(testIdx,:), 'Showplot',false);
    %# evaluate and update performance object
    cp = classperf(cp, pred, testIdx);
end
cp.CorrectRate
cp.CountingMatrix
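The problem is that this actually computes the accuracy 11 times in total: once for each of the 10 folds, and a final time as the average. But if I take the individual predictions for each fold and print pred in each loop iteration, the accuracy understandably drops by a lot.
However, I need a column matrix of predicted values, one for each row of the data. Any ideas on how to modify the code?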
Answer 0 (score: 1)
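The whole idea of cross-validation is to get an unbiased estimate of a classifier's performance.
Once that is done, you usually just train a model on the entire data set. That model is then used to predict future instances.
So just do this: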
svmModel = svmtrain(Data, groups, ...);
pred = svmclassify(svmModel, otherData, ...);
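If what is needed is one cross-validated prediction per row of the original data, rather than predictions on new data, one possible approach is to preallocate a column vector and fill in each fold's test rows inside the loop. The following is only a sketch under stated assumptions: the synthetic Data and groups stand in for the contents of DataWorkspace.mat, and the names allPred and cvAccuracy are made up for illustration. It reuses svmtrain and svmclassify from the question; these were removed in newer MATLAB releases, where fitcsvm and predict would be the likely replacements.

```matlab
% Synthetic stand-in for Data and groups (assumption, not the asker's data).
rng(0);                                    % reproducible samples and folds
Data   = [randn(25,2) + 1; randn(25,2) - 1];
groups = [true(25,1); false(25,1)];

k = 10;
cvFolds = crossvalind('Kfold', groups, k);
allPred = false(size(groups));             % hypothetical name: one slot per row of Data

for i = 1:k
    testIdx  = (cvFolds == i);
    trainIdx = ~testIdx;
    svmModel = svmtrain(Data(trainIdx,:), groups(trainIdx), ...
        'Autoscale',true, 'Showplot',false, 'Method','SMO', ...
        'Kernel_Function','rbf');
    % store this fold's predictions in the rows it was tested on
    allPred(testIdx) = svmclassify(svmModel, Data(testIdx,:), 'Showplot',false);
end

% fraction of rows whose out-of-fold prediction matches the true label
cvAccuracy = mean(allPred == groups);
```

Because every row falls in exactly one test fold, allPred ends up with a prediction for each row, and each prediction comes from a model that never saw that row during training.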