Accessing and modifying OpenCV decision tree nodes when using AdaBoost

Date: 2016-02-25 23:10:00

Tags: opencv machine-learning decision-tree adaboost

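I am learning a boosted tree from 30,000 randomly generated features. Learning is limited to, say, the best 100 features. After learning, how do I extract from the CvBoost object the indices of the features used by the decision trees?

My motivation is to remove the requirement to generate all 30,000 features and to compute only those that will actually be used. I have included a printout of the yml file generated by the CvBoost.save function. I think what I want is the value called sample_count, which identifies the feature in a depth-1 decision tree, as shown below: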

 trees:
      -
         best_tree_idx: -1
         nodes:
            -
               depth: 0
               sample_count: 11556
               value: -1.8339875099775065e+00
               norm_class_idx: 0
               Tn: 0
               complexity: 0
               alpha: 0.
               node_risk: 0.
               tree_risk: 0.
               tree_error: 0.
               splits:
                  - { var:497, quality:8.6223608255386353e-01,
                      le:5.3123302459716797e+00 }
            -
               depth: 1
               sample_count: 10702
               value: -1.8339875099775065e+00
               norm_class_idx: 0
               Tn: 0
               complexity: 0
               alpha: 0.
               node_risk: 0.
               tree_risk: 0.
               tree_error: 0.
            -
               depth: 1
               sample_count: 854
               value: 1.8339875099775065e+00
               norm_class_idx: 1
               Tn: 0
               complexity: 0
               alpha: 0.
               node_risk: 0.
               tree_risk: 0.
               tree_error: 0.
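
A note on reading this dump: in the files written by CvBoost.save, the `var` entry inside `splits` is the index of the split variable (497 for the root here) and `le` is the threshold, while `sample_count` is the number of training samples that reached the node. As a rough sketch, the used feature indices could be pulled straight out of the saved text. The function name `collectSplitVars` is made up for illustration, and this is plain string scanning that assumes the `{ var:497, ...` layout shown above, not a real YAML parser:

```cpp
#include <cctype>
#include <set>
#include <string>

// Collect every integer that follows a "var:" key in the text of a saved model.
// Assumption: splits are written as "{ var:497, quality:..., le:... }" as in the dump above.
std::set<int> collectSplitVars(const std::string& yamlText)
{
    std::set<int> vars;
    const std::string key = "var:";
    std::string::size_type pos = 0;
    while ((pos = yamlText.find(key, pos)) != std::string::npos)
    {
        // Reject keys that merely end in "var:" (the preceding character must not be part of a name)
        bool partOfLongerKey = pos > 0 &&
            (std::isalnum(static_cast<unsigned char>(yamlText[pos - 1])) || yamlText[pos - 1] == '_');
        pos += key.size();
        if (partOfLongerKey)
            continue;

        // Skip optional spaces between the key and the number
        while (pos < yamlText.size() && yamlText[pos] == ' ')
            ++pos;

        // Read the run of digits that forms the feature index
        std::string::size_type end = pos;
        while (end < yamlText.size() && std::isdigit(static_cast<unsigned char>(yamlText[end])))
            ++end;
        if (end > pos)
            vars.insert(std::stoi(yamlText.substr(pos, end - pos)));
        pos = end;
    }
    return vars;
}
```

The result is a sorted set of distinct indices, so a feature used by several weak trees appears only once.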

Edit

Currently I have the following code for accessing the data:

    //Interrogate the decision trees. Each element is a decision tree, together making up the classifier
    CvSeq* decisionTree = boostDevice.get_weak_predictors();

    simplifyFeatureSet(decisionTree, firstOrderROIs );

The function is:

inline void Chnftrs::simplifyFeatureSet(CvSeq* decisionTree, std::vector<boost::tuple<int, cv::Rect> >& rois)
{
    //This variable stores the index of the feature used from rois and a pointer to the split so that the variable there can
    //be updated when the rois are pruned and reordered.
    std::vector<boost::tuple<int, CvDTreeSplit* > > featureIdx;

    //Determine the max depth of the tree

    printf("Size of boost %d \n", decisionTree->total);

    for (int i = 0; i < decisionTree->total; i++)
    {
        //Get the root of the tree
        CvBoostTree *tree = 0;
        tree = (CvBoostTree*)cvGetSeqElem(decisionTree, i);

        if (tree == 0)
        {
            printf("Tree is NULL\n");
            continue;
        }
        printf("Tree Addr %p\n", (void*)tree);

        const CvDTreeNode *root = tree->get_root();

        printf("Sample_count %d, Value %f ", root->sample_count, root->value);

        featureIdx.push_back(boost::tuple<int, CvDTreeSplit*>(root->split->var_idx, root->split));

        //Search down the right hand side
        depthFirstSearch(root->right, featureIdx);

        //Search down the left hand side
        depthFirstSearch(root->left, featureIdx);
    }
}

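However, when I try to access any member of root, for example root->sample_count, I get a segmentation fault. It may be that the members of CvDTree are inaccessible unless CvDTreeTrainData.shared is set to true (it is false by default), as indicated here

Any help would be great

Cheers

Peter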

1 Answer:

Answer 0 (score: 0)

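OK,

I was able to edit the decision trees by following the approach the OpenCV source uses when a CvBoost classifier saves itself to disk and reads itself back. For decision-tree objects, casting the result of cvGetSeqElem() directly to CvBoostTree* does not give a valid pointer. The likely reason is that the sequence stores CvBoostTree* elements and cvGetSeqElem() returns a pointer to the element, so the result is really a CvBoostTree** that would have to be dereferenced first.

To get at the decision trees, a CvSeqReader together with cvStartReadSeq works best. The macro CV_READ_SEQ_ELEM() advances the reader by itself each time it fetches the next tree from the sequence in the loop: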

    // weak is the CvSeq* returned by boostDevice.get_weak_predictors()
    CvSeqReader reader;
    cvStartReadSeq( weak, &reader );

    for (int i = 0; i < weak->total; i++)
    {
        // Copies the stored CvBoostTree* out of the sequence and advances the reader
        CvBoostTree* tree;
        CV_READ_SEQ_ELEM( tree, reader );

        const CvDTreeNode *root = tree->get_root();

        printf("Root Split VarIdx : %d c: %f, ", root->split->var_idx, root->split->ord.c);

        featureIdx.push_back(boost::tuple<int, CvDTreeSplit*>(root->split->var_idx, root->split));

        //Search down the right hand side
        depthFirstSearch(root->right, featureIdx);

        //Search down the left hand side
        depthFirstSearch(root->left, featureIdx);
    }
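
The depthFirstSearch helper is not shown above. A minimal sketch of what it, and the renumbering step that simplifyFeatureSet is heading towards, might look like is below. It deliberately uses stand-in `Split` and `Node` structs with only the fields needed here instead of the real CvDTreeSplit and CvDTreeNode, uses std::pair in place of boost::tuple, and assumes that leaf nodes have a null split pointer. The name `compactFeatureIndices` is made up for illustration. I have not checked whether rewriting var_idx on the real objects is enough for CvBoost::predict to work with the pruned feature vector.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Stand-ins for the few fields of CvDTreeSplit / CvDTreeNode used here (not the OpenCV types).
struct Split { int var_idx; };
struct Node  { Split* split; Node* left; Node* right; };

typedef std::vector<std::pair<int, Split*> > FeatureRefs;

// Record (feature index, split pointer) for `node` and every internal node below it.
void depthFirstSearch(Node* node, FeatureRefs& featureIdx)
{
    if (node == NULL || node->split == NULL)  // assumed: leaves carry no split
        return;
    featureIdx.push_back(std::make_pair(node->split->var_idx, node->split));
    depthFirstSearch(node->right, featureIdx);
    depthFirstSearch(node->left, featureIdx);
}

// Renumber the used features 0..k-1 in ascending order of their old index,
// write the new index into every recorded split, and return the old indices
// that were kept (position in the result == new index).
std::vector<int> compactFeatureIndices(FeatureRefs& featureIdx)
{
    // Collect the distinct old indices; std::map keeps them sorted
    std::map<int, int> oldToNew;
    for (std::size_t i = 0; i < featureIdx.size(); ++i)
        oldToNew[featureIdx[i].first] = 0;

    // Assign consecutive new indices
    std::vector<int> kept;
    for (std::map<int, int>::iterator it = oldToNew.begin(); it != oldToNew.end(); ++it)
    {
        it->second = static_cast<int>(kept.size());
        kept.push_back(it->first);
    }

    // Update both the bookkeeping entry and the split it points at
    for (std::size_t i = 0; i < featureIdx.size(); ++i)
    {
        int newIdx = oldToNew[featureIdx[i].first];
        featureIdx[i].second->var_idx = newIdx;
        featureIdx[i].first = newIdx;
    }
    return kept;
}
```

The returned vector lists which of the original 30,000 features still need to be computed, in the order the renumbered splits expect them.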