Question

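I have the following C++ code that extracts a given range of text from a Piece Table data structure. Below is the member function of class PieceTable that stores the given range of text in the character array buffer: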

void PieceTable::getTextInRange(unsigned __int64 startPos, unsigned __int64 endPos, char buffer[]){

    char* totalBuffer = new char[getSize() + 2];

    getBuffer(totalBuffer);

    if(endPos >= getSize())
        endPos = getSize() - 1; 

    cout<<"startPos : "<<startPos<<endl;
    cout<<"endPos : "<<endPos<<endl;

    memcpy(buffer, &totalBuffer[startPos], endPos - startPos + 1);

    buffer[endPos - startPos + 2] = '\0';

    if(totalBuffer != 0)
        delete[] totalBuffer;
    totalBuffer = 0;
}

Here is the snippet from my main method that I use to test this code:

temp2 = new char[end - start + 2];  //changing 2 to 3 solves the problem
pieceTable.getTextInRange(Start, end, temp2);
for(int i = 0; i< end - start + 1; i++)
   cout<<temp2[i];
cout<<endl;

if( temp2 != 0)
{
  delete[] temp2;   //this line causes the heap corruption error
  temp2 = 0;
}

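Declaration of temp2: char* temp2;

Whenever the program reaches the delete[] temp2 statement, I get a heap corruption error. The problem does not occur if I allocate the memory for temp2 as temp2 = new char[end - start + 3]. So, basically, changing the length fixes the problem. I know I am messing up a length somewhere, but I cannot figure out where.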

Edit: getSize():

__int64 PieceTable::getSize()
{
    return dList.getLength(dList.getBack());
}

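I am using a piece table data structure, as described in this paper: http://www.cs.unm.edu/~crowley/papers/sds.pdf

I could be wrong, but I don't think there is any problem with getSize(), because the function I use to retrieve the entire buffer, getBuffer, relies on it and works as shown in the code.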

Answer 1

In PieceTable::getTextInRange, you have this line:

buffer[endPos - startPos + 2] = '\0';

And when you allocate the memory that you pass in as buffer, you allocate it like this:

temp2 = new char[end - start + 2];

Let's plug in some real numbers, say start = 2 and end = 5:

buffer[5 - 2 + 2] = '\0';

temp2 = new char[5 - 2 + 2];

That is equivalent to:

buffer[5] = '\0';

temp2 = new char[5];

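Well, there's your problem. If you do new char[5], you get an array with valid indices 0 through 4. 5 is not a valid index into this array.

I would suggest you make it a rule, to be broken only in rare circumstances, that you always specify ranges as [begin, end) the way the STL does. That means the end you specify is one past the last index you want. This makes range arithmetic much less error-prone. Keeping your interface consistent with how the STL works also makes it easier to use. For example, with this scheme the size of a range is always end - begin.

There is an old (circa 1982) paper by E.W. Dijkstra that gives some good reasons why this scheme for expressing ranges is the best one.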
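To make the half-open convention concrete, here is a minimal sketch of a range copy that takes [begin, end). The function name copyRange and its parameters are made up for illustration and are not part of the asker's PieceTable class; it reads from a plain character array rather than a piece table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not part of the original PieceTable class.
// Copies the half-open range [begin, end) of src into dest and
// null-terminates the result. dest must hold at least
// (end - begin) + 1 chars. Returns the number of characters copied,
// not counting the terminator.
std::size_t copyRange(const char* src, std::size_t srcLen,
                      std::size_t begin, std::size_t end, char* dest)
{
    assert(src != nullptr && dest != nullptr);
    if (end > srcLen) end = srcLen;  // clamp to one past the last valid index
    if (begin > end) begin = end;    // treat an inverted range as empty
    const std::size_t count = end - begin;  // size of [begin, end)
    std::memcpy(dest, src + begin, count);
    dest[count] = '\0';  // last valid index of a (count + 1)-char buffer
    return count;
}
```

With this shape the caller allocates new char[end - begin + 1], and the only "+ 1" left in the whole calculation is the one for the terminator.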

Answer 2

The reason that changing the 2 to a 3 in your code:

temp2 = new char[end - start + 2];

works is that otherwise you write past the end of the buffer in getTextInRange (you are off by one).

The end and start above correspond to the parameters endPos and startPos of getTextInRange, and in getTextInRange you do:

buffer[endPos - startPos + 2] = '\0';

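The valid range of the array is [0, endPos - startPos + 2); therefore, the element at position endPos - startPos + 2 is one past the end of the array. Overwriting that location causes the heap corruption.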
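If you would rather keep the inclusive [startPos, endPos] interface and the end - start + 2 allocation, the terminator has to go at index endPos - startPos + 1 instead. A rough sketch follows; copyInclusiveRange is a hypothetical stand-in that reads from a plain character array, since I don't have the real PieceTable or getBuffer to test against.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for PieceTable::getTextInRange that reads from a
// plain character array instead of a piece table.
// Copies the inclusive range [startPos, endPos] of text into buffer.
// buffer must hold at least endPos - startPos + 2 chars.
void copyInclusiveRange(const char* text, std::uint64_t size,
                        std::uint64_t startPos, std::uint64_t endPos,
                        char* buffer)
{
    assert(size > 0 && startPos < size);
    if (endPos >= size) endPos = size - 1;  // clamp to the last valid index
    assert(startPos <= endPos);
    const std::uint64_t count = endPos - startPos + 1;  // chars to copy
    std::memcpy(buffer, text + startPos, static_cast<std::size_t>(count));
    buffer[count] = '\0';  // index endPos - startPos + 1, the last valid slot
}
```

Note that clamping endPos inside the function means the caller's buffer may end up larger than needed, but never too small.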

Answer 3

It is clear from your code that the last index you use in getTextInRange is:

endPos-startPos+2 //last index

That pretty much explains why you need to allocate memory of at least this size:

endPos-startPos+3 //number of objects : memory allocation

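That is, if you allocate memory for N objects, you access the last object in the array with index N-1, which is also the largest valid index of the array. Index N is out of range. Recall that indexing starts at 0, so it has to end at N-1, not N.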
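One way to see the N versus N-1 rule in action without corrupting the heap is to use a bounds-checked container. The sketch below uses std::vector::at, which throws std::out_of_range for any index that is not less than size(); the helper name writeTerminatorAt is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: tries to write '\0' at the given index and reports
// whether that index was valid for the buffer.
bool writeTerminatorAt(std::vector<char>& buf, std::size_t index)
{
    try {
        buf.at(index) = '\0';  // at() throws std::out_of_range if index >= size()
        return true;
    } catch (const std::out_of_range&) {
        return false;          // index N (or larger) in an N-element buffer
    }
}
```

For a buffer of 5 chars, writing at index 4 succeeds and writing at index 5 is rejected, which is the same write that silently damages the heap with a raw new char[5].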

Possible buffer overflow problem
