我正在尝试使用libsvm和MATLAB来评估一对一的SVM,唯一的问题是我的数据集不够大,无法保证选择一个特定的测试集。因此,我想用留一法来评估我的分类器。
我对使用SVM并不是特别有经验,所以请原谅我,如果我对该怎么做有点困惑。我需要为我的分类器生成精确vs召回曲线和混淆矩阵,但我不知道从哪里开始。
我已经试了一下,并提出以下作为开始训练的粗略开始,但我不确定如何进行评估。
function model = do_leave_one_out(labels, data)
acc = [];
bestC = [];
bestG = [];
for ii = 1:length(data)
% Training data for this iteration
trainData = data;
trainData(ii) = [];
looLabel = labels(ii);
trainLabels = labels;
trainLabels(ii) = [];
% Do grid search to find the best parameters?
acc(ii) = bestReportedAccuracy;
bestC(ii) = bestValueForC;
bestG(ii) = bestValueForG;
end
% After this I am not sure how to train and evaluate the final model
end
答案 0 :(得分:4)
我正在尝试提供您可能感兴趣的一些模块,您可以将它们合并到您的函数中。希望它有所帮助。
<强>留一法:强>
scrambledList = randperm(totalNumberOfData);
trainingData = Data(scrambledList(1:end-1),:);
trainingLabel = Label(scrambledList(1:end-1));
testData = Data(scrambledList(end),:);
testLabel = Label(scrambledList(end));
网格搜索(双类案例):
acc = 0;
for log2c = -1:3,
for log2g = -4:1,
cmd = ['-v 5 -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
cv = svmtrain(trainingLabel, trainingData, cmd);
if (cv >= acc),
acc = cv; bestC = 2^log2c; bestG = 2^log2g;
end
end
end
One-vs-all(用于多类案例):
model = cell(NumofClass,1);
for k = 1:NumofClass
model{k} = svmtrain(double(trainingLabel==k), trainingData, '-c 1 -g 0.2 -b 1');
end
%% calculate the probability of different labels
pr = zeros(1,NumofClass);
for k = 1:NumofClass
[~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
pr(:,k) = p(:,model{k}.Label==1); %# probability of class==k
end
%% your label prediction will be the one with highest probability:
[~,predctedLabel] = max(pr,[],2);