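I have a random forest implementation in C++ that I run from MATLAB via MEX. It runs smoothly until it reaches the function below, where it hangs and starts consuming memory until the computer freezes.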
void MyFunction(
    const IDataPointCollection& data,
    std::vector<std::vector<int> >& leafNodeIndices,
    ProgressStream* progress=0 ) const
{
    ProgressStream defaultProgressStream(std::cout, Interest);
    progress = (progress==0)?&defaultProgressStream:progress;
    leafNodeIndices.resize(TreeCount());
    tbb::parallel_for<int>(0,TreeCount(),[&](int t)
    {
        leafNodeIndices[t].resize(data.Count());
        (*progress)[Interest] << "\rApplying tree " << t << "...";
        trees_[t]->Apply(data, leafNodeIndices[t]);
    });
    (*progress)[Interest] << "STUCK HERE" << std::endl;
    return;
}
Tracing through the code of trees_[t]->Apply() above, I was able to narrow the problem down to the following recursive function:
void ApplyNode(
    int nodeIndex,
    const IDataPointCollection& data,
    std::vector<unsigned int>& dataIndices,
    int i0,
    int i1,
    std::vector<int>& leafNodeIndices,
    std::vector<float>& responses_)
{
    std::cout<<"applying node"<<std::endl;
    assert(nodes_[nodeIndex].IsNull()==false);
    Node<F,S>& node = nodes_[nodeIndex];
    if (node.IsLeaf())
    {
        for (int i = i0; i < i1; i++)
            leafNodeIndices[dataIndices[i]] = nodeIndex;
        return;
    }
    else if (i0 == i1) // No samples left
        return;
    else
    {
        for (int i = i0; i < i1; i++)
            responses_[i] = node.Feature.GetResponse(data, dataIndices[i]);
        int ii = Partition(responses_, dataIndices, i0, i1, node.Threshold);
        // Recurse for child nodes.
        ApplyNode(nodeIndex * 2 + 1, data, dataIndices, i0, ii, leafNodeIndices, responses_);
        ApplyNode(nodeIndex * 2 + 2, data, dataIndices, ii, i1, leafNodeIndices, responses_);
        return;
    }
}
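Each call to the recursive function takes a different amount of computation time, depending on the node.Feature.GetResponse() function. If I make the computation time the same for all recursive calls (by changing GetResponse()), the code runs smoothly.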
float AxisAlignedFeatureResponse::GetResponse(const IDataPointCollection& data, int index) const {
    double retArg;
    // retrieve DataManager object
    const DataManager& concreteData = (const DataManager&)(data);
    // retrieve data point at index
    DataPoint currDataPoint = concreteData.getDataPoint(index);
    // get coordinates of data point
    Coordinate currCoordinates = currDataPoint.getOrigPos();
    // get intensity image of the respective data point
    int imgIndex = currDataPoint.getImageIndex();
    Image currImg = concreteData.getImage(imgIndex);
    Image currFeatureImg = concreteData.getFeatureImage(imgIndex);
    // return respective feature
    int featureNumber = (int)(this->axis*(double)concreteData.getNumberOfFeatures());
    if(featureNumber>=concreteData.getNumberOfFeatures()){
        cout<<"warning! trying to reach a feature that is not there!"<<endl;
        featureNumber=concreteData.getNumberOfFeatures()-1;
    }
    std::vector<Coordinate> feature = concreteData.getFeature(featureNumber);
    Coordinate tmp=currCoordinates+feature[0];
    if(feature[1].x == 0) {
        retArg = currCoordinates.x*feature[0].x+currCoordinates.y*feature[0].y+currCoordinates.z*feature[0].z;
        //retArg = 0; //DOING THIS runs the code smoothly
    }
    else if(feature[1].x == 2) {
        retArg = currFeatureImg.getValue(feature[0]);
    } else {
        retArg = currImg.mean(tmp,feature[1]);
    }
    return (float)(retArg);
    //return (float) 0;
}
Answer 0 (score: 0)
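This looks like a job for valgrind. Unless your program terminates abruptly, I can't see anything that would cause a memory leak. Memory leaks are usually caused by heap variables that are never freed (a "new" with no subsequent "delete").
Valgrind will give you the culprit and the line number; you can then set a breakpoint and step through with gdb to see exactly what is happening. Perhaps post the verbose output of running "valgrind -v your_program". Don't forget to compile with the -g option so that valgrind and gdb have full debugging data.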
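To make the "new without delete" pattern concrete, here is a minimal sketch of the kind of leak a tool like valgrind is meant to catch. It is not taken from the question's code: Tracked, leaky, safe and leakedAfterDemo are hypothetical names used only for illustration, and the instance counter exists purely so the leak can be observed without a tool.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that counts live instances so a leak is observable.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Leaks: the heap object is never deleted, so its destructor never runs.
void leaky() {
    Tracked* p = new Tracked();
    (void)p;  // pointer goes out of scope here; the allocation is lost
}

// Does not leak: std::unique_ptr deletes the object at scope exit.
void safe() {
    std::unique_ptr<Tracked> p(new Tracked());
}

// Runs both variants and returns how many Tracked objects were leaked.
int leakedAfterDemo() {
    int before = Tracked::live;
    safe();
    assert(Tracked::live == before);  // unique_ptr cleaned up after itself
    leaky();
    return Tracked::live - before;    // the object from leaky() is still alive
}
```

If a program calling leaky() is compiled with -g and run under valgrind with its --leak-check=full option, the report should point at the allocation inside leaky(). In real code, prefer owning types such as std::unique_ptr or std::vector over raw new, so that cleanup does not depend on remembering a matching delete.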