How can I use leave-one-out cross-validation to estimate the generalization of a one-vs-one classifier?

Time: 2016-01-19 10:57:00

Tags: matlab machine-learning statistics pattern-recognition

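As a basic machine learning question, I would like to know how to estimate the generalization of a classifier, given the following description: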

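  1. Dataset properties:

    • Number of samples: 220
    • Number of features: 19
    • Number of classes: 7
  2. Classification method:

    • Support vector machine (SVM).
    • As you know, my task is to assign each sample to one of 7 classes, so I use the one-vs-one approach followed by majority voting. [Classification procedure: I build 21 classifiers and classify every sample with each of them. Up to this step I have 21 class labels per sample, each one of (1,2,3,4,5,6,7). In the next step I count how often each label occurs (this is the majority vote), for example for an input x: #Class Label ONE = 7, #TWO = 5, #THREE = 4, #FOUR = 1, #FIVE = 1, #SIX = 1, #SEVEN = 2.] Each sample is assigned to the class that receives the most votes.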
  3. Given this description, how can I analyze my classifier with regard to generalization?

    Any MATLAB code that could help is welcome.
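
To make the voting step from item 2 concrete, here is a minimal sketch on made-up predictions. The matrix `g`, the 4-class setup (6 pairwise classifiers) and the names `votes`, `n_votes` and `predicted` are illustrative assumptions, not my real data.

```matlab
% Hypothetical pairwise predictions: one row per sample, one column per
% pairwise classifier, in the order (1,2) (1,3) (1,4) (2,3) (2,4) (3,4).
g = [1 1 1 2 2 3;
     1 3 4 3 4 4;
     2 3 1 2 2 3];
n_classes = 4;

% votes(i,c) = number of pairwise classifiers that assigned sample i to class c
votes = zeros(size(g,1), n_classes);
for c = 1:n_classes
    votes(:, c) = sum(g == c, 2);
end

% The winning class is the one with the most votes.
% Note: max returns the lowest class index when two classes tie.
[n_votes, predicted] = max(votes, [], 2);
disp([predicted, n_votes]);
```

Because `max` resolves ties in favour of the lowest class index, a tie between two classes is silently decided by label order; a different tie-breaking rule would have to be added explicitly.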

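    % Load the pre-split data; data_train and data_test are used below as
    % (samples x features x classes) arrays.
    load('dataset.mat');
    
    X_train = data_train;
    X_test = data_test;
    
    [train_size, feature_size, class_size] = size(X_train);
    test_size = size(X_test,1);
    
    % Stack the per-class test slices into one (test_size*class_size) x features matrix.
    x = [];
    for index=1:class_size
        x = [x; X_test(:,:,index)];
    end
    X_test = x;
    
    % True labels, in the same class-by-class order as the stacked test rows.
    correct_label = [];
    for index=1:class_size
        correct_label = [correct_label; index*ones(test_size,1)];
    end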
    
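    % Train one binary SVM per pair of classes (one-vs-one) and store its
    % predictions on the whole test set as one column of g.
    g = [];
    C = 1;  % box constraint (1 is also the fitcsvm default)
    for i=class_size:-1:1
        for j=1:(i-1)
            train = [X_train(:,:,j); X_train(:,:,i)];
            y = [j*ones(train_size,1); i*ones(train_size,1)];
            % GHI_Kernel is a user-defined kernel function and must be on the MATLAB path.
            svmModel = fitcsvm(train, y, 'Standardize',true, ...
                'KernelFunction','GHI_Kernel', 'BoxConstraint',C);
            group = predict(svmModel, X_test);
            g = [g, group];
        end
    end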
    % Count, for every test sample, how many pairwise classifiers voted for each class.
    temp = zeros(size(X_test,1),class_size);
    n = size(X_test,1);
    for i=1:n
        for j=1:class_size
            temp(i,j) = sum(g(i,:)==j);
        end
    end
    % esm1 contains the predicted class label
    % val1 contains the number of votes for the predicted class label
    [val1, esm1] = max(temp,[],2);
    
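    %% Calculating Confidence Matrix
    % Zero out the winning class so that the next max returns the runner-up.
    for idx=1:n
        temp(idx,esm1(idx)) = 0;
    end
    [val2, esm2] = max(temp,[],2);
    confidence_matrix = zeros(class_size, class_size);
    counter = zeros(class_size, class_size);
    
    % Accumulate the relative vote margin (val1 - val2)/val1 in the cell
    % (predicted class, true class); counter holds the number of samples per cell.
    for idx=1:n
        confidence_matrix(esm1(idx), correct_label(idx))...
            = confidence_matrix(esm1(idx), correct_label(idx)) + ...
            (val1(idx) - val2(idx))/val1(idx);
        counter(esm1(idx), correct_label(idx)) = ...
            counter(esm1(idx), correct_label(idx)) + 1;
    end
    
    % Normalize each column (true class) so that it sums to 1.
    for idx=1:class_size
        col_sum = sum(confidence_matrix(:,idx));
        if col_sum > 0  % avoid 0/0 = NaN when every margin in the column is zero
            confidence_matrix(:,idx) = confidence_matrix(:,idx)/col_sum;
        end
    end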
    
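    % Correct classification rate on the test set and its complement, the test error.
    CCR = sum(esm1 == correct_label)/size(correct_label,1);
    % plot_confusion_matrix is presumably a user-defined helper, not a built-in function.
    plot_confusion_matrix(esm1, correct_label);
    error_rate = 1 - CCR;  % avoids shadowing the built-in error()
    % confusionmat expects (known, predicted): rows are true classes, columns are predictions.
    confusion_matrix = confusionmat(correct_label, esm1);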
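
One possible way to obtain the leave-one-out estimate asked about above is to let `fitcecoc` (Statistics and Machine Learning Toolbox) build the pairwise SVMs and hold out one sample at a time. The following is a minimal sketch under assumptions, not a drop-in replacement for the code above: the data are synthetic stand-ins, the variable names are made up for illustration, a linear kernel replaces `GHI_Kernel`, and `fitcecoc` combines the binary learners by loss-based decoding rather than a plain majority vote, so its predictions may differ slightly from the voting scheme above.

```matlab
% Synthetic stand-in data (assumption): 7 classes, 19 features, 10 samples per class.
rng(0);                                   % reproducible random numbers
n_classes = 7; n_features = 19; n_per_class = 10;
Y = repelem((1:n_classes)', n_per_class); % class labels, one per row of X
X = bsxfun(@plus, randn(numel(Y), n_features), 2*Y);  % class-dependent mean shift

% Binary learner used for every pair of classes.
% A custom kernel name could be passed here instead of 'linear'.
t = templateSVM('Standardize', true, 'KernelFunction', 'linear');

% One-vs-one multiclass model, cross-validated by leaving out one sample per fold.
cvModel = fitcecoc(X, Y, 'Learners', t, 'Coding', 'onevsone', 'Leaveout', 'on');

looError = kfoldLoss(cvModel);            % fraction of held-out samples misclassified
looPred  = kfoldPredict(cvModel);         % prediction for each sample when it was held out
confMat  = confusionmat(Y, looPred);      % rows: true class, columns: predicted class

fprintf('Leave-one-out error: %.3f\n', looError);
```

For the real data, `X` would be the stacked 220 x 19 feature matrix and `Y` the matching 220 x 1 label vector. Every sample is predicted by a model that never saw it during training, so `looError` estimates the generalization error rather than the error on a single fixed test split.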
    

0 answers:

No answers