OpenCV error using cv::ml::StatModel::calcError with a model trained on a selected subset of features

Time: 2017-02-24 13:00:01

Tags: c++ opencv

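Using OpenCV 3.2.0, I am trying to compute the training and test errors of an SVM model built on a subset of features, which I select by passing a varIdx vector to cv::ml::TrainData::create(). Here is the relevant part of the C++ code.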

using namespace cv;
using namespace cv::ml;
using namespace std;

// Code to read samples and responses from external data file not shown...

// Copy sample vectors into a Mat (one sample per row)
Mat matSamples(samples.size(), samples.at(0).size(), CV_32F);
for(int i = 0; i < matSamples.rows; i++) {
  for(int j = 0; j < matSamples.cols; j++) {
      matSamples.at<float>(i, j) = samples.at(i).at(j);
  }
}

// Copy response vector into a Mat (one class label per sample)
Mat matResponses(responses.size(), 1, CV_32SC1);
for(int i = 0; i < matResponses.rows; i++) {
  matResponses.at<int>(i) = responses.at(i);
}

// Create Mat to specify training variables (features) as 0-based column indices
Mat matVarIdx = (Mat_<int>(1, 7) << 0, 15, 26, 27, 28, 29, 31);
cout << "Using features specified by " << matVarIdx << endl;

// Construct training data from samples read from file above
Ptr<TrainData> td = TrainData::create(
                    matSamples,                 // Array of samples
                    ROW_SAMPLE,                 // Data in rows
                    matResponses,               // Array of responses
                    matVarIdx,                  // Use features specified
                    noArray(),                  // Use all data points
                    noArray(),                  // Do not use samples weights
                    noArray()                   // Do not specify inp and out types
                    );

// Split training and test data
double ratio = 0.90; // 90% of samples will be labeled training data
bool shuffle = true; // randomly shuffle test and training data
td->setTrainTestSplitRatio(ratio, shuffle);
int n_train_samples = td->getNTrainSamples();
int n_test_samples  = td->getNTestSamples();

cout << "Found " << n_train_samples << " Train Samples, and "
     << n_test_samples << " Test Samples." << endl;

// Output number of features
cout << "Total number of features " << td->getNAllVars() << " and "
     << td->getNVars() << " features used." << endl;

// Set up SVM's parameters
Ptr<SVM> svm = SVM::create();
svm->setType(SVM::C_SVC);
svm->setKernel(SVM::RBF);
svm->setTermCriteria(TermCriteria(TermCriteria::MAX_ITER, 1000, FLT_EPSILON));

// Train the SVM with given parameters
svm->train(td);

// Calculate errors.
Mat results;
float train_performance = svm->StatModel::calcError(
                                         td,
                                         false, // use train data
                                         results);

cout << "Incorrectly classified training samples: " << train_performance << "%" << endl;

float test_performance = svm->StatModel::calcError(
                                        td,
                                        true, // use test data
                                        results);

cout << "Incorrectly classified test samples: " << test_performance << "%" << endl;

Here is the program's output:

Using features specified by [0, 15, 26, 27, 28, 29, 31]
Found 267 Train Samples, and 30 Test Samples.
Total number of features 32 and 7 features used.
OpenCV Error: Assertion failed (samples.cols == var_count && samples.type() == CV_32F) in predict, file /home/lindo/dev/opencv/opencv-3.2.0/modules/ml/src/svm.cpp, line 1930
terminate called after throwing an instance of 'cv::Exception'
what():  /home/lindo/dev/opencv/opencv-3.2.0/modules/ml/src/svm.cpp:1930: error: (-215) samples.cols == var_count && samples.type() == CV_32F in function predict
Aborted

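It looks like the prediction used to compute the error fails because the number of sample columns does not equal the number of features I intended to use, as set through varIdx when creating the training data.

The code works when I use the full set of features, i.e. when varIdx is set to cv::noArray() in cv::ml::TrainData::create().

I tried using a vector instead of a Mat for varIdx, and also a CV_8UC1 Mat, but I still get the same assertion error.

Any help is much appreciated!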

1 Answer:

Answer 0 (score: 0)

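The documentation explains that the varIdx vector identifies the variables of interest. It can be given either as a list of 0-based variable indices (an integer vector, CV_32S) or as a mask of active variables (a byte vector, CV_8U).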

So you could try creating a varIdx vector with number_of_features elements, assigning each variable a value of 1 if it should be considered and 0 otherwise.
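The mask form described above can be sketched in plain C++. The helper name indicesToMask is hypothetical and not part of OpenCV; the resulting bytes would still have to be copied into a CV_8UC1 Mat before being passed as varIdx. Note that the asker reports already trying a CV_8UC1 Mat without success, so this only illustrates the alternative encoding and may not remove the assertion.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not an OpenCV function): convert a list of 0-based
// feature indices into a 0/1 mask with one entry per feature.
std::vector<unsigned char> indicesToMask(const std::vector<int>& indices,
                                         int numFeatures) {
    if (numFeatures < 0) {
        throw std::invalid_argument("numFeatures must be non-negative");
    }
    std::vector<unsigned char> mask(static_cast<std::size_t>(numFeatures), 0);
    for (int idx : indices) {
        if (idx < 0 || idx >= numFeatures) {
            throw std::out_of_range("feature index outside [0, numFeatures)");
        }
        mask[static_cast<std::size_t>(idx)] = 1;  // mark this feature as active
    }
    return mask;
}
```

For the question's data, indicesToMask({0, 15, 26, 27, 28, 29, 31}, 32) yields a 32-element mask with exactly 7 entries set to 1.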

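I think your error occurs because varIdx contains only 7 values rather than 32, so at prediction time OpenCV expects only 7 columns but is given 32. The error does not occur when you do not specify the variables of interest, because the algorithm then takes every variable into account with equal weight.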
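If the cause really is a column-count mismatch at prediction time, one possible workaround is to reduce each sample to the selected columns yourself and compute the error rate by hand instead of calling calcError. This is an assumption, not something the OpenCV documentation prescribes. The sketch below uses plain std::vector data and hypothetical helper names (selectColumns, misclassifiedPercent); in the question's program the predicted labels would come from svm->predict on the reduced samples.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: keep only the columns listed in varIdx, so that every
// output row has exactly varIdx.size() values (7 in the question).
std::vector<std::vector<float>> selectColumns(
        const std::vector<std::vector<float>>& samples,
        const std::vector<int>& varIdx) {
    std::vector<std::vector<float>> out;
    out.reserve(samples.size());
    for (const auto& row : samples) {
        std::vector<float> reduced;
        reduced.reserve(varIdx.size());
        for (int idx : varIdx) {
            if (idx < 0 || static_cast<std::size_t>(idx) >= row.size()) {
                throw std::out_of_range("feature index outside the sample row");
            }
            reduced.push_back(row[static_cast<std::size_t>(idx)]);
        }
        out.push_back(reduced);
    }
    return out;
}

// Hypothetical helper: percentage (0 to 100) of predictions that differ from
// the true labels, the same kind of figure the question prints.
float misclassifiedPercent(const std::vector<int>& predicted,
                           const std::vector<int>& actual) {
    if (predicted.size() != actual.size() || actual.empty()) {
        throw std::invalid_argument(
            "label vectors must be non-empty and equal in size");
    }
    std::size_t wrong = 0;
    for (std::size_t i = 0; i < actual.size(); ++i) {
        if (predicted[i] != actual[i]) {
            ++wrong;
        }
    }
    return 100.0f * static_cast<float>(wrong) /
           static_cast<float>(actual.size());
}
```

Doing the column selection explicitly makes the shape of the data passed to the predictor visible, which is the quantity the failed assertion (samples.cols == var_count) checks.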

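Also, don't hesitate to look at the library's source code. The error message gives you the file and line where the failure occurs, which can help you understand why the code fails.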