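I am trying to run SVM-RFE with the libsvm library on a gene expression dataset. My algorithm is written in MATLAB. Without any feature selection, this dataset gives 80+% classification accuracy under 5-fold CV. When I apply SVM-RFE to the same dataset (same SVM parameter settings, also 5-fold CV), the results get worse: classification accuracy only reaches 60+%.
Here is my MATLAB code. I would appreciate it if anyone could tell me what is wrong with it. Thanks in advance.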
[label, data] = libsvmread('libsvm_data.scale');
[N, D] = size(data);
numfold = 5;
indices = crossvalind('Kfold', label, numfold);
cp = classperf(label);
for i = 1:numfold
    disp(strcat('Fold-', int2str(i)));
    % Split the data into the test fold and the training folds
    testix = (indices == i); trainix = ~testix;
    test_data = data(testix,:); test_label = label(testix);
    train_data = data(trainix,:); train_label = label(trainix);
    % Model trained on all D features (the option string is cut off in the original post)
    model = svmtrain(train_label, train_data, sprintf('-s 0 -t 0'));
    % SVM-RFE: each iteration drops the feature with the smallest squared weight
    s = 1:D;    % indices of the surviving features
    r = [];     % ranked feature list
    iter = 1;
    while ~isempty(s)
        X = train_data(:,s);
        fs_model = svmtrain(train_label, X, sprintf('-s 0 -t %f -c %f -g %f -b 1', kernel, cost, gamma));
        w = fs_model.SVs' * fs_model.sv_coef;   % weight vector (valid for a linear kernel)
        c = w.^2;                               % ranking criterion
        [c_minvalue, f] = min(c);
        r = [s(f), r];                          % prepend the eliminated feature
        ind = [1:f-1, f+1:length(s)];
        s = s(ind);
        iter = iter + 1;
    end
    predefined = 100;
    important_feat = r(:, D-predefined+1:end);
    % Keep only the selected columns of the test fold
    for l = 1:length(important_feat)
        testdata(:,l) = test_data(:, important_feat(l));
    end
    [predict_label_itest, accuracy_itest, prob_values] = svmpredict(test_label, testdata, model, '-b 1');
    acc_itest_fs(:,i) = accuracy_itest(1);
    clear testdata;
end
Mean_itest_fs = mean((acc_itest_fs), 2);
Mean_bac_fs = mean(bac_fs, 2);   % bac_fs is not defined in the posted snippet
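Answer 0 (score: 0)
After applying RFE to the training data, you end up with a subset of that training data's features. Your model, however, is trained on the full training data and then asked to predict on the reduced test data. I think you should train the model on the same feature subset of the training data that you use for the test data.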
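A minimal sketch of that change, not a definitive fix. It assumes the libsvm MATLAB interface (`svmtrain`, `svmpredict`) is on the path and that `r` is the ranking vector built by the loop in the question. The function name `fold_accuracy_on_selected`, the argument `k`, and the option string `'-s 0 -t 0'` are illustrative assumptions, since the question's own option string is cut off. `-b 1` can be added to both calls if probability estimates are needed.

```matlab
function acc = fold_accuracy_on_selected(train_data, train_label, test_data, test_label, r, k)
% Hypothetical helper: train and test on the same k RFE-selected columns.
%   r : feature ranking from the RFE loop, most important feature first
%   k : number of features to keep (the question uses 100)

    % The question builds r by prepending each eliminated feature, so the
    % features eliminated last (ranked highest) sit at the front of r.
    keep = r(1:k);

    % Train on the selected columns only (option string is an assumption).
    model = svmtrain(train_label, train_data(:, keep), '-s 0 -t 0');

    % Predict on the same columns of the held-out fold.
    [~, accuracy, ~] = svmpredict(test_label, test_data(:, keep), model);

    % svmpredict returns accuracy as a vector; its first entry is the
    % classification accuracy in percent.
    acc = accuracy(1);
end
```

Inside the cross-validation loop this would be called as `acc_itest_fs(:,i) = fold_accuracy_on_selected(train_data, train_label, test_data, test_label, r, predefined);`. Note that the sketch takes `r(1:k)`: with `r = [s(f), r]`, the question's `r(:, D-predefined+1:end)` selects the features that were eliminated first, which are the lowest-ranked ones.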