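I am working on a query processor that reads long lists of document IDs from memory and looks for matching IDs. When it finds one, it creates a DOC struct containing the docid (an int) and the document's rank (a double) and pushes it onto a priority queue. My problem is that when the word being searched for has a long list, I get the following exception when I try to push the DOC onto the queue: Unhandled exception at 0x7c812afb in QueryProcessor.exe: Microsoft C++ exception: std::bad_alloc at memory location 0x0012ee88.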
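When the word has a short list, it works fine. I tried pushing a DOC onto the queue at several places in my code, and they all work up to a certain line; after that, I get the error above. I have no idea what is wrong, because the longest list read in is smaller than 1 MB and I free all the memory I allocate. Why would a bad_alloc exception suddenly appear when I push a DOC onto a queue that has the capacity to hold it (I used a vector with enough space reserved as the underlying data structure for the priority queue)?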
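I know that a question like this is almost impossible to answer without seeing all the code, but it is too long to post here. I am including as much as I can and anxiously hoping someone can give me an answer, because I am at my wits' end.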
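The NextGEQ function reads a list of compressed docids block by block. That is, if it sees that the lastdocid of a block (stored in a separate list) is larger than the docid passed in, it decompresses the block and searches it until it finds the right docid. Each list starts with metadata about the list: the length of each compressed block and the last docid in that block. data.iquery points to the beginning of the metadata; data.metapointer points to wherever in the metadata the function currently is; and data.blockpointer points to the beginning of the uncompressed block of docids, if there is one. If the function sees that the block has already been decompressed, it just searches it. Below, when I call the function the first time, it decompresses a block and finds the docid; the push onto the queue after that works. The second time, it does not even need to decompress; that is, no new memory is allocated, but after that call, pushing onto the queue gives a bad_alloc error.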
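Edit: I cleaned up my code so that it compiles. I also added the OpenList() and NextGEQ functions, although the latter is long, because I think the problem is caused by heap corruption somewhere in it. Thanks a lot!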
struct DOC{
long int docid;
long double rank;
public:
DOC()
{
docid = 0;
rank = 0.0;
}
DOC(int num, double ranking)
{
docid = num;
rank = ranking;
}
bool operator>( const DOC & d ) const {
return rank > d.rank;
}
bool operator<( const DOC & d ) const {
return rank < d.rank;
}
};
struct listnode{
int* metapointer;
int* blockpointer;
int docposition;
int frequency;
int numberdocs;
int* iquery;
listnode* nextnode;
};
void QUERYMANAGER::SubmitQuery(char *query){
listnode* startlist;
vector<DOC> docvec;
docvec.reserve(20);
DOC doct;
//create a priority queue to use as a min-heap to store the documents and rankings;
priority_queue<DOC, vector<DOC>,std::greater<DOC>> q(docvec.begin(), docvec.end());
q.push(doct);
//do some processing here; startlist is a pointer to a listnode struct that starts the linked list
//point the linked list start pointer to the node returned by the OpenList method
startlist = &OpenList(value);
listnode* minpointer;
q.push(doct);
//start by finding the first docid in the shortest list
int i = 0;
q.push(doct);
num = NextGEQ(0, *startlist);
q.push(doct);
while(num != -1)
{
q.push(doct);
//this is where the problem starts - every previous q.push(doct) works; the one after
//NextGEQ(num +1, *startlist) gives the bad_alloc error
num = NextGEQ(num + 1, *startlist);
//this is where the exception is thrown
q.push(doct);
}
}
//takes a word and returns a listnode struct with a pointer to the beginning of the list
//and metadata about the list
listnode QUERYMANAGER::OpenList(char* word)
{
long int numdocs;
//create a new node in the linked list and initialize its variables
listnode n;
n.iquery = cache -> GetiList(word, &numdocs);
n.docposition = 0;
n.frequency = 0;
n.numberdocs = numdocs;
//an int pointer to point to where in the metadata you are
n.metapointer = n.iquery;
n.nextnode = NULL;
//an int pointer to point to the uncompressed block of data, if there is one
n.blockpointer = NULL;
return n;
}
int QUERYMANAGER::NextGEQ(int value, listnode& data)
{
int lengthdocids;
int lengthfreqs;
int lengthpos;
int* temp;
int lastdocid;
lastdocid = *(data.metapointer + 2);
while(true)
{
//if it's not the first chunk in the list, the blockpointer will be pointing to the
//most recently opened block and docpos to the current position in the block
if( data.blockpointer && lastdocid >= value)
{
//if the last docid in the chunk is >= the docid we're looking for,
//go through the chunk to look for a match
//the last docid in the block is in lastdocid; keep going until you hit it
while(*(data.blockpointer + data.docposition) <= lastdocid)
{
//compare each docid with the docid passed in; if it's greater than or equal to it, return a pointer to the docid
if(*(data.blockpointer + data.docposition ) >= value)
{
//return the next greater than or equal docid
return *(data.blockpointer + data.docposition);
}
else
{
++data.docposition;
}
}
//read through the whole block; couldn't find matching docid; increment metapointer to the next block;
//free the block's memory
data.metapointer += 3;
lastdocid = *(data.metapointer + 3);
free(data.blockpointer);
data.blockpointer = NULL;
}
//reached the end of a block; check the metadata to find where the next block begins and ends and whether
//the last docid in the block is smaller or larger than the value being searched for
//first make sure that you haven't reached the end of the list
//if the last docid in the chunk is still smaller than the value passed in, move the metadata pointer
//to the beginning of the next chunk's metadata; read in the new metadata
while(true)
// while(*(metapointers[index]) != 0 )
{
if(lastdocid < value && *(data.metapointer) !=0)
{
data.metapointer += 3;
lastdocid = *(data.metapointer + 2);
}
else if(*(data.metapointer) == 0)
{
return -1;
}
else
//we must have hit a chunk whose lastdocid is >= value; read it in
{
//read in the metadata
//the length of the chunk of docid's is cumulative, so subtract the end of the last chunk
//from the end of this chunk to get the length
//find the end of the metadata
temp = data.metapointer;
while(*temp != 0)
{
temp += 3;
}
temp += 2;
//temp is now pointing to the beginning of the list of compressed data; use the location of metapointer
//to calculate where to start reading and how much to read
//if it's the first chunk in the list,the corresponding metapointer is pointing to the beginning of the query
//so the number of bytes of docid's is just the first integer in the metadata
if( data.metapointer == data.iquery)
{
lengthdocids = *data.metapointer;
}
else
{
//start reading from the offset of the end of the last chunk (saved in metapointers[index] - 3)
//plus 1 = the beginning of this chunk
lengthdocids = *(data.metapointer) - (*(data.metapointer - 3));
temp += (*(data.metapointer - 3)) / sizeof(int);
}
//allocate memory for an array of integers - the block of docid's uncompressed
int* docblock = (int*)malloc(lengthdocids * 5 );
//decompress docid's into the block of memory allocated
s9decompress((int*)temp, lengthdocids /4, (int*) docblock, true);
//set the blockpointer to point to the beginning of the block
//and docpositions[index] to 0
data.blockpointer = docblock;
data.docposition = 0;
break;
}
}
}
}
Thanks very much, bsg.
Answer 0: (score: 1)
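QUERYMANAGER::OpenList returns a listnode by value. Then, in startlist = &OpenList(value); you go on to take the address of the temporary object that is returned. Once the temporary goes away, you may be able to access the data for a while before it gets overwritten. Could you declare a non-pointer listnode startlist on the stack and assign the returned value to it directly? Then remove the * in front of its other uses and see whether that fixes the problem.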
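
To make the suggestion concrete, here is a minimal sketch of the by-value version. The trimmed-down listnode and the free functions OpenList, FirstMeta, and Lookup are hypothetical stand-ins for the question's members, not the original code. Note that standard C++ does not allow taking the address of a temporary at all; some compilers accept it only as an extension.

```cpp
#include <cassert>
#include <cstddef>

// Trimmed-down stand-in for the question's listnode (hypothetical subset of fields).
struct listnode {
    int* metapointer;
    int* iquery;
    int  docposition;
};

// Stand-in for QUERYMANAGER::OpenList: builds a listnode and returns it by value.
listnode OpenList(int* list) {
    listnode n;
    n.iquery = list;
    n.metapointer = list;
    n.docposition = 0;
    return n;
}

// Stand-in for NextGEQ: takes the node by reference, as in the question.
int FirstMeta(listnode& data) {
    return *data.metapointer;
}

int Lookup(int* list) {
    // Problematic form from the question:
    //     listnode* startlist = &OpenList(list);
    // The returned temporary is destroyed at the end of that statement,
    // so the pointer dangles on every later use.
    listnode startlist = OpenList(list);  // named object, lives until Lookup returns
    return FirstMeta(startlist);          // pass the object itself, no '*' needed
}
```

With this shape, calls such as NextGEQ(0, *startlist) in the question would become NextGEQ(0, startlist).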
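Answer 1: (score: 1)
Another thing you can try is replacing all the pointers with smart pointers, specifically something like boost::shared_ptr<>, depending on how big this code really is and how much you want to automate the task. Smart pointers are not the answer to everything, but they are at least safer than raw pointers.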
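
As a rough illustration of the idea, the sketch below uses std::unique_ptr from the C++11 standard library instead of boost::shared_ptr, since the decompressed block has a single owner. FreeDeleter, IntBlock, and MakeBlock are hypothetical names, and the listnode here is reduced to the two fields that matter for the example.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>
#include <new>

// Deleter that releases memory obtained from malloc.
struct FreeDeleter {
    void operator()(void* p) const { std::free(p); }
};

// Owning pointer to a malloc'ed block of ints; the block is freed exactly once.
typedef std::unique_ptr<int[], FreeDeleter> IntBlock;

// Hypothetical helper: allocate room for 'count' ints, throwing on failure.
// Note that malloc(0) may legally return NULL, so pass count > 0.
IntBlock MakeBlock(std::size_t count) {
    void* raw = std::malloc(count * sizeof(int));
    if (raw == nullptr) throw std::bad_alloc();
    return IntBlock(static_cast<int*>(raw));
}

// Reduced listnode: the smart pointer replaces the raw int* plus manual free().
struct listnode {
    IntBlock blockpointer;
    int docposition = 0;
};
```

With this layout, the pair free(data.blockpointer); data.blockpointer = NULL; would collapse to data.blockpointer.reset();, and the block can no longer be freed twice by accident.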
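Answer 2: (score: 0)
Assuming you have heap corruption and are not actually running out of memory, the most common way a heap gets corrupted is by deleting (or freeing) the same pointer twice. You can easily find out whether this is the problem by commenting out all your calls to delete (or free). This will make your program leak like a sieve, but if it no longer crashes, you have probably found the problem.
The other common cause of a corrupted heap is deleting (or freeing) a pointer that was never allocated on the heap. Telling the two causes apart is not always easy, but your first priority should be to find out whether corruption really is the problem.
Note that this approach will not work well if the things you are deleting have destructors and skipping those destructors breaks the semantics of your program.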
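
If commenting out every free is impractical, a small tracking wrapper can flag both kinds of bad free directly. This is only a debugging sketch under the assumption that all allocations can be routed through it; AllocTracker and its methods are hypothetical names, not part of any library.

```cpp
#include <cassert>
#include <cstdlib>
#include <set>

// Hypothetical debugging aid: remembers which pointers are currently allocated.
class AllocTracker {
public:
    // Allocates with malloc and records the pointer if the call succeeded.
    void* Alloc(std::size_t bytes) {
        void* p = std::malloc(bytes);
        if (p != nullptr) live_.insert(p);
        return p;
    }

    // Frees p only if it is a live block obtained from Alloc.
    // Returns false for a double free or for a pointer Alloc never handed out.
    bool Free(void* p) {
        if (p == nullptr) return true;  // free(NULL) is a harmless no-op
        std::set<void*>::iterator it = live_.find(p);
        if (it == live_.end()) return false;
        live_.erase(it);
        std::free(p);
        return true;
    }

    // Number of blocks allocated through Alloc and not yet freed.
    std::size_t LiveCount() const { return live_.size(); }

private:
    std::set<void*> live_;
};
```

A false return from Free marks the exact call site that would otherwise have corrupted the heap, which is easier to act on than a crash several allocations later.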
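Answer 3: (score: 0)
Thanks for all your help. You were right, Neil - I must have managed to corrupt my heap. I am still not sure what was causing it, but when I changed malloc(numdocids * 5) to malloc(256), it magically stopped crashing. I guess I should have checked whether my mallocs were actually succeeding! Thanks again! bsg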
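
A minimal sketch of such a check is shown below. AllocInts is a hypothetical helper, not code from the question, and it does not claim to identify the actual cause of the crash. It only makes a failed or nonsensical allocation request fail loudly instead of handing a NULL pointer to the decompressor.

```cpp
#include <cassert>
#include <cstdlib>
#include <limits>
#include <new>

// Hypothetical helper: allocate 'count' ints or throw; never returns NULL.
int* AllocInts(long count) {
    // A zero or negative count usually means the length arithmetic went wrong.
    if (count <= 0) throw std::bad_alloc();

    // Guard against overflow when converting the element count to bytes.
    std::size_t n = static_cast<std::size_t>(count);
    if (n > std::numeric_limits<std::size_t>::max() / sizeof(int)) {
        throw std::bad_alloc();
    }

    void* p = std::malloc(n * sizeof(int));
    if (p == nullptr) throw std::bad_alloc();
    return static_cast<int*>(p);
}
```

The caller still releases the block with free, exactly as with a plain malloc.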