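Segfault in LibSVM with large feature vectors

Time: 2015-10-17 03:49:03

Tags: c++ android-ndk libsvm openframeworks

I am running LibSVM in an Android application (NDK). I implemented similar code in a Mac application, where it works for every feature vector size. When I supply a vector of 408 features, multi-class classification works without problems. With 409 features or more, however (I will eventually be feeding in 16800), it appears to fail: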

10-16 23:28:41.084 30997-31028/? A/libc: Fatal signal 11 (SIGSEGV), code 1, fault addr 0xaf000000 in tid 31028 (GLThread 17147)
10-16 23:28:41.190 27393-27393/? I/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
10-16 23:28:41.191 27393-27393/? I/DEBUG: Build fingerprint: 'google/hammerhead/hammerhead:5.1.1/LMY48M/2167285:user/release-keys'
10-16 23:28:41.191 27393-27393/? I/DEBUG: Revision: '11'
10-16 23:28:41.191 27393-27393/? I/DEBUG: ABI: 'arm'
10-16 23:28:41.191 27393-27393/? I/DEBUG: pid: 30997, tid: 31028, name: GLThread 17147  >>> cc.openframeworks.androidEmptyExample <<<
10-16 23:28:41.191 27393-27393/? I/DEBUG: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xaf000000
10-16 23:28:41.202 27393-27393/? I/DEBUG:     r0 aef3e000  r1 aef5ed10  r2 00000001  r3 af000000
10-16 23:28:41.202 27393-27393/? I/DEBUG:     r4 aec29eb8  r5 00000001  r6 b4b2c608  r7 12d090c0
10-16 23:28:41.202 27393-27393/? I/DEBUG:     r8 12d15660  r9 b4a39400  sl 00000000  fp af37d824
10-16 23:28:41.202 27393-27393/? I/DEBUG:     ip b6e417dc  sp af37d810  lr a301ff78  pc a301ff04  cpsr 000f0010
10-16 23:28:41.202 27393-27393/? I/DEBUG:     #00 pc 00167f04  /data/app/cc.openframeworks.androidEmptyExample-1/lib/arm/libOFAndroidApp.so (Kernel::dot(svm_node const*, svm_node const*)+192)

Here is the relevant code I use for training:

ofxSvm mSvm;
void ofApp::update()
{ // Runs in a loop
    for (int i = 0; i < 8400; ++i)
    {
        // tempValue is incoming data from a serial port (8400 doubles per packet)
        HandDataVector.push_back((double)tempValue[i]);
    }
    // If I exclude this I get segfaults:
    HandDataVector.resize(150);
    if (learningToggleBoxTicked)
    {
        mSvm.addData(label, HandDataVector); // ofxSvm::addData takes the label first
        mSvm.train();
    } else {
        ofLogNotice("Classified As") << classify();
    }
}

int ofApp::classify()
{
    return mSvm.predict(HandDataVector);
}

Here is the ofxSvm library I am using:

    int ofxSvm::addData(int label, vector<double>& vec)
    {
            checkDimension(vec.size());

            mData.insert(make_pair(label, vec));
            mDimension = vec.size();


            stringstream ss;
            for (const auto v : vec) ss << v << " ";
            ss << "EOS";
            ofLogNotice(LOG_MODULE, "add data, label: " + ofToString(label) + " size: "+ofToString(vec.size())+" vec: " + ss.str());


            return mData.size();
   }
   void ofxSvm::train()
   {


            svm_problem prob;

            prob.l = mData.size();

            prob.y = new double[prob.l];
            {
                data_type::iterator it = mData.begin();
                int i = 0;
                while (it != mData.end())
                {
                    prob.y[i] = it->first;
                    ++it; ++i;
                }
            }


            if(mParam.gamma == 0)
            {
                mParam.gamma = 1.0 / mDimension;
            }

            int nodeLength = mDimension + 1;
            svm_node* node = new svm_node[prob.l * nodeLength];
            prob.x = new svm_node*[prob.l];
            {
                data_type::iterator it = mData.begin();
                int i = 0;
                while (it != mData.end())
                {
                    prob.x[i] = node + i * nodeLength;
                    for (int j = 0; j < mDimension; ++j)
                    {
                        prob.x[i][j].index = j + 1;
                        prob.x[i][j].value = it->second[j];
                    }
                    prob.x[i][mDimension].index = -1; // delimiter
                    ++it; ++i;
                }
            }

            ofLogNotice(LOG_MODULE, "Start train...");

            mModel = svm_train(&prob, &mParam);



            ofLogNotice(LOG_MODULE, "Finished train!");

            int x = mModel->nr_class;

            ofLogNotice("TRAINED MODEL LABELS: " + ofToString(x));

            delete[] node;
            delete[] prob.x;
            delete[] prob.y;
        }

        int ofxSvm::predict(vector<double>& testVec)
        {
            if (mModel == NULL)
            {
                ofLogNotice(LOG_MODULE, "null model, before do train or load model file");
                return -5;
            }
            if (testVec.size() != mDimension)
            {
                ofLogNotice(LOG_MODULE, "different dimension");
                return -6;
            }


            svm_node* node = new svm_node[mDimension + 1];
            for (int i = 0; i < mDimension; ++i)
            {
                node[i].index = i + 1;
                node[i].value = testVec[i];
                ofLogNotice("node") << node[i].value <<"-" << i;

            }
            node[mDimension].index = -1;

            int res = static_cast<int>(svm_predict(mModel, node));


            stringstream ss;
            for (const auto v : testVec) ss << v << " ";
            ss << "EOS";
            ofLogNotice(LOG_MODULE, "add data, label: size: "+ofToString(testVec.size())+" vec: " + ss.str());


            ofLogNotice("ANSWER")<< res;


            delete[] node;
            return res;
    }

Here is the function in the LibSVM library where the fault occurs:

double Kernel::dot(const svm_node *px, const svm_node *py)
{
    double sum = 0;
    while(px->index != -1 && py->index != -1)
    {
        if(px->index == py->index)
        {
            sum += px->value * py->value;
            ++px;
            ++py;
        }
        else
        {
            if(px->index > py->index)
                ++py;
            else
                ++px;
        }
    }
    return sum;
}

Edit: here is where the dot function gets called (k_function, from svm_predict_values):

double svm_predict_values(const svm_model *model, const svm_node *x, double* dec_values)
{
    int i;
    if(model->param.svm_type == ONE_CLASS ||
       model->param.svm_type == EPSILON_SVR ||
       model->param.svm_type == NU_SVR)
    {
        double *sv_coef = model->sv_coef[0];
        double sum = 0;
        for(i=0;i<model->l;i++)
            sum += sv_coef[i] * Kernel::k_function(x,model->SV[i],model->param);
        sum -= model->rho[0];
        *dec_values = sum;

        if(model->param.svm_type == ONE_CLASS)
            return (sum>0)?1:-1;
        else
            return sum;
    }
    else
    {
        int nr_class = model->nr_class;
        int l = model->l;

        double *kvalue = Malloc(double,l);
        for(i=0;i<l;i++)
            kvalue[i] = Kernel::k_function(x,model->SV[i],model->param);

        int *start = Malloc(int,nr_class);
        start[0] = 0;
        for(i=1;i<nr_class;i++)
            start[i] = start[i-1]+model->nSV[i-1];

        int *vote = Malloc(int,nr_class);
        for(i=0;i<nr_class;i++)
            vote[i] = 0;

        int p=0;
        for(i=0;i<nr_class;i++)
            for(int j=i+1;j<nr_class;j++)
            {
                double sum = 0;
                int si = start[i];
                int sj = start[j];
                int ci = model->nSV[i];
                int cj = model->nSV[j];

                int k;
                double *coef1 = model->sv_coef[j-1];
                double *coef2 = model->sv_coef[i];
                for(k=0;k<ci;k++)
                    sum += coef1[si+k] * kvalue[si+k];
                for(k=0;k<cj;k++)
                    sum += coef2[sj+k] * kvalue[sj+k];
                sum -= model->rho[p];
                dec_values[p] = sum;

                if(dec_values[p] > 0)
                    ++vote[i];
                else
                    ++vote[j];
                p++;
            }

        int vote_max_idx = 0;
        for(i=1;i<nr_class;i++)
            if(vote[i] > vote[vote_max_idx])
                vote_max_idx = i;

        free(kvalue);
        free(start);
        free(vote);
        return model->label[vote_max_idx];
    }
}

double Kernel::k_function(const svm_node *x, const svm_node *y,
                          const svm_parameter& param)
{
    switch(param.kernel_type)
    {
        case LINEAR:
            return dot(x,y);
        case POLY:
            return powi(param.gamma*dot(x,y)+param.coef0,param.degree);
        case RBF:
        {
            double sum = 0;
            while(x->index != -1 && y->index !=-1)
            {
                if(x->index == y->index)
                {
                    double d = x->value - y->value;
                    sum += d*d;
                    ++x;
                    ++y;
                }
                else
                {
                    if(x->index > y->index)
                    {
                        sum += y->value * y->value;
                        ++y;
                    }
                    else
                    {
                        sum += x->value * x->value;
                        ++x;
                    }
                }
            }

            while(x->index != -1)
            {
                sum += x->value * x->value;
                ++x;
            }

            while(y->index != -1)
            {
                sum += y->value * y->value;
                ++y;
            }

            return exp(-param.gamma*sum);
        }
        case SIGMOID:
            return tanh(param.gamma*dot(x,y)+param.coef0);
        case PRECOMPUTED:  //x: test (validation), y: SV
            return x[(int)(y->value)].value;
        default:
            return 0;  // Unreachable 
    }
}

    double kernel_linear(int i, int j) const
    {
        return dot(x[i],x[j]);
    }

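2 answers:

Answer 0 (score: 1)

Introduction

I had a hard time finding any documentation for libSVM in C++, so my answer is based mainly on the code you have shown us, without any real reference documentation.

If you could post a link to the documentation, that would be very helpful :)

Potential problem

You showed the code that initializes the svm_node* arrays in your training code: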

prob.x[i][mDimension].index = -1;

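Here you are initializing an svm_problem instance. Evidently, the dot() function expects a node with a negative index to mark the end of an svm_node list.

But your code fails when it is called like this: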

Kernel::k_function(x,model->SV[i],model->param);

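Here, model is an svm_model, not an svm_problem, so I guess it is what libSVM returns after training the model. Are you sure svm_model follows the convention of marking the end of a node list with node->index = -1?

So the sentinel node may not exist, and you run out of bounds.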
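
To make that concern concrete, here is a minimal sketch of a bounded check for the terminator. It assumes you know how many nodes were allocated for each vector (mDimension + 1 in the question's code). findTerminator is a hypothetical helper, not a libsvm function, and svm_node is redeclared here only so the snippet stands alone.

```cpp
#include <cstddef>
#include <vector>

// Same two fields as libsvm's svm_node in svm.h; redeclared so the sketch compiles alone.
struct svm_node {
    int index;
    double value;
};

// Hypothetical helper (not part of libsvm): scans at most `capacity` nodes and
// returns the position of the first node whose index is -1, or -1 if no
// terminator exists inside the buffer. Unlike Kernel::dot, it never reads
// past the memory it was told about.
long findTerminator(const svm_node* nodes, std::size_t capacity)
{
    for (std::size_t i = 0; i < capacity; ++i) {
        if (nodes[i].index == -1) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

Running a check like this on every model->SV[i] just before svm_predict would show whether the terminator is still where train() wrote it. One thing worth ruling out: the libsvm README warns that svm_model keeps pointers into the svm_problem data, so that memory must not be freed while the model is still in use, and the train() in the question calls delete[] node right after svm_train.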

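Why it suddenly breaks at 409 items

The signal you get is SIGSEGV, which means you tried to access a byte in a page that is not mapped in your process.

When you access an item that is out of bounds, you usually do not get a SIGSEGV, because the previously allocated item sits in the middle of a memory page and there are enough bytes left in that page. It may be that the 409th item falls just past the currently allocated page, which triggers the signal.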
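
The page argument can be put into numbers with a small sketch. The 4096-byte page size used in the example values is an assumption (it is common, but not guaranteed), and nodesBeforePageEnd is a hypothetical helper written for this illustration. It counts how many svm_node elements can still be read from a given address before the read crosses into the next page.

```cpp
#include <cstddef>
#include <cstdint>

// Same two fields as libsvm's svm_node in svm.h; redeclared so the sketch compiles alone.
struct svm_node {
    int index;
    double value;
};

// Hypothetical helper: number of whole svm_node elements that fit between
// `addr` and the end of the page containing `addr`. An out-of-bounds scan
// that stays within this count reads garbage silently; only a read beyond it
// can land in the next page, which faults if that page is unmapped.
std::size_t nodesBeforePageEnd(std::uintptr_t addr, std::size_t pageSize)
{
    const std::size_t offsetInPage = static_cast<std::size_t>(addr % pageSize);
    const std::size_t bytesLeft = pageSize - offsetInPage;
    return bytesLeft / sizeof(svm_node);
}
```

In the crash log, the fault address 0xaf000000 is a multiple of 0x1000, so with 4 KiB pages it is the first byte of a page. That would be consistent with a scan that walked off the end of a mapped region, though the log alone does not prove it.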

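Answer 1 (score: 0)

I had a similar problem in the '.dot' function, which I solved by not including '0' as a value. As @fjardon mentioned, the function expects a node with index -1 and value 0 to mark the end of the list of svm_nodes (the training vector).

It is surely a bit late for you, but hopefully it will help future libsvm users :)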
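
A minimal sketch of that approach, assuming a dense std::vector<double> as input: zero entries are skipped, indices stay 1-based as in the question's code, and a terminator with index -1 and value 0 is always appended. toSparseNodes is a hypothetical name, not something ofxSvm or libsvm provides.

```cpp
#include <cstddef>
#include <vector>

// Same two fields as libsvm's svm_node in svm.h; redeclared so the sketch compiles alone.
struct svm_node {
    int index;
    double value;
};

// Hypothetical helper: converts a dense feature vector into a sparse,
// terminated svm_node list. Features equal to 0 are left out, and the last
// element always has index -1 so that loops like Kernel::dot know where to stop.
std::vector<svm_node> toSparseNodes(const std::vector<double>& features)
{
    std::vector<svm_node> nodes;
    nodes.reserve(features.size() + 1);
    for (std::size_t i = 0; i < features.size(); ++i) {
        if (features[i] != 0.0) {
            nodes.push_back({static_cast<int>(i) + 1, features[i]});
        }
    }
    nodes.push_back({-1, 0.0}); // terminator: index -1, value 0
    return nodes;
}
```

The result's data() pointer can be handed to anything that expects a const svm_node*, as long as the vector outlives that use.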