libsvm: Evaluating an SVM using leave-one-out

Time: 2014-02-06 09:57:53

Tags: matlab machine-learning classification svm libsvm

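I am trying to evaluate a one-vs-one SVM using libsvm and MATLAB. The only problem is that my dataset is not large enough to justify setting aside a dedicated test set, so I would like to evaluate my classifier using leave-one-out instead.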

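I am not particularly experienced with SVMs, so please forgive me if I am a little confused about how to proceed. I need to produce precision-recall curves and a confusion matrix for my classifier, but I do not know where to start.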

I have had a go and come up with the following rough starting point for the training part, but I am not sure how to do the evaluation.

function model = do_leave_one_out(labels, data)
    % Assumes data is N-by-D (one sample per row) and labels is N-by-1
    numSamples = size(data, 1);
    acc = zeros(numSamples, 1);
    bestC = zeros(numSamples, 1);
    bestG = zeros(numSamples, 1);
    for ii = 1:numSamples
        % Training data for this iteration (drop row ii, not a single element)
        trainData = data;
        trainData(ii, :) = [];
        looLabel = labels(ii);
        trainLabels = labels;
        trainLabels(ii) = [];

        % Do grid search to find the best parameters?

        acc(ii) = bestReportedAccuracy;
        bestC(ii) = bestValueForC;
        bestG(ii) = bestValueForG;
    end
    % After this I am not sure how to train and evaluate the final model
end

1 Answer:

Answer 0 (score: 4)

Here are a few building blocks you may find useful and can merge into your function. Hope it helps.

Leave-one-out:

scrambledList = randperm(totalNumberOfData);
trainingData = Data(scrambledList(1:end-1),:);
trainingLabel = Label(scrambledList(1:end-1));
testData = Data(scrambledList(end),:);
testLabel = Label(scrambledList(end));
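
The snippet above holds out a single random sample. A full leave-one-out run repeats this once per sample and collects the held-out predictions. Below is a minimal sketch of that loop, not a definitive implementation. It assumes libsvm's MATLAB interface (svmtrain/svmpredict) is on the path, that Data has one sample per row, and it uses made-up toy data and fixed parameters ('-c 1 -g 0.2', the same values used in the one-vs-all example) purely for illustration.

```matlab
% Hypothetical toy data for illustration; replace with your own Data/Label.
Data  = [0 0; 0 1; 1 0; 5 5; 5 6; 6 5];   % N-by-D, one sample per row
Label = [1; 1; 1; 2; 2; 2];               % N-by-1 class labels

N = size(Data, 1);
predicted = zeros(N, 1);

for ii = 1:N
    % Logical mask selecting every sample except the held-out one
    trainIdx = true(N, 1);
    trainIdx(ii) = false;

    % Train on N-1 samples (fixed parameters here; a grid search could go here)
    model = svmtrain(Label(trainIdx), Data(trainIdx, :), '-c 1 -g 0.2');

    % Predict the single held-out sample; the first output is the predicted label
    predicted(ii) = svmpredict(Label(ii), Data(ii, :), model);
end

% Leave-one-out accuracy: fraction of held-out samples predicted correctly
looAccuracy = mean(predicted == Label);
```

The vector predicted can then be compared against Label to build a confusion matrix over all N held-out predictions, rather than looking at each fold's single-sample accuracy.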

Grid search (two-class case):

acc = 0;
for log2c = -1:3,
  for log2g = -4:1,
    cmd = ['-v 5 -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
    cv = svmtrain(trainingLabel, trainingData, cmd);
    if (cv >= acc),
      acc = cv; bestC = 2^log2c; bestG = 2^log2g;
    end    
  end
end
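
Note that with the -v option svmtrain returns the cross-validation accuracy as a scalar, not a model. To get a model you can predict with, train once more without -v using the selected parameters. The following is a sketch under assumptions: the toy data and the values of bestC and bestG are placeholders standing in for your own data and the grid search result, and libsvm's MATLAB interface is assumed to be on the path.

```matlab
% Hypothetical toy data; replace with your own training/test split.
trainingData  = [0 0; 0 1; 1 0; 5 5; 5 6; 6 5];
trainingLabel = [1; 1; 1; 2; 2; 2];
testData  = [0.5 0.5];
testLabel = 1;

% Placeholders standing in for the values found by the grid search
bestC = 2;
bestG = 0.25;

% Train the final model WITHOUT -v so that a model struct is returned
cmd = ['-c ', num2str(bestC), ' -g ', num2str(bestG)];
model = svmtrain(trainingLabel, trainingData, cmd);

% accuracy is a 3-element vector; its first entry is classification accuracy in percent
[predictedLabel, accuracy, ~] = svmpredict(testLabel, testData, model);
```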

One-vs-all (for the multi-class case):

model = cell(NumofClass,1);
for k = 1:NumofClass
    model{k} = svmtrain(double(trainingLabel==k), trainingData, '-c 1 -g 0.2 -b 1');
end

%% calculate the probability of different labels

pr = zeros(1,NumofClass);
for k = 1:NumofClass
    [~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
    pr(:,k) = p(:,model{k}.Label==1);    %# probability of class==k
end

%% your label prediction will be the one with highest probability:

[~,predictedLabel] = max(pr,[],2);
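
Once you have a predicted label for every held-out sample, the confusion matrix and per-class precision and recall the question asks about can be computed with plain MATLAB. This is a minimal sketch that assumes labels are integers 1..numClass; the two label vectors below are made up for illustration and are not results from any real classifier.

```matlab
% Hypothetical true and predicted labels (integers 1..numClass)
trueLabel = [1; 1; 1; 2; 2; 3; 3; 3];
predLabel = [1; 1; 2; 2; 2; 3; 1; 3];
numClass  = 3;

% confMat(i, j) = number of samples with true class i predicted as class j
confMat = accumarray([trueLabel, predLabel], 1, [numClass, numClass]);

truePos = diag(confMat);

% Precision: correct predictions of class k / all predictions of class k (column sums)
precision = truePos ./ sum(confMat, 1)';

% Recall: correct predictions of class k / all true members of class k (row sums)
recall = truePos ./ sum(confMat, 2);
```

A class that is never predicted gives a zero column sum, so its precision comes out as NaN; handle that case explicitly if it can occur in your data. For a precision-recall curve rather than a single point per class, one option is to sweep a threshold over the per-class probabilities in pr and recompute precision and recall at each threshold.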