Question

I have the following code, which parses a text file and indexes its words and lines:

bool Database::addFromFileToListAndIndex(string path, BSTIndex* & index, list<Line *> & myList)
{
    bool result = false;
    ifstream txtFile;
    txtFile.open(path, ifstream::in);
    char line[200];
    Line * ln;
    //if path is valid AND is not already in the list then add it
    if(txtFile.is_open() && (find(textFilePaths.begin(), textFilePaths.end(), path) == textFilePaths.end())) //the path is valid
    {
        //Add the path to the list of file paths
        textFilePaths.push_back(path);
        int lineNumber = 1;
        while(!txtFile.eof())
        {
            txtFile.getline(line, 200);
            ln = new Line(line, path, lineNumber);
            if(ln->getLine() != "")
            {
                lineNumber++;
                myList.push_back(ln);
                vector<string> words = lineParser(ln);
                for(unsigned int i = 0; i < words.size(); i++)
                {
                    index->addWord(words[i], ln);
                }
            }
        }
        result = true;
    }
    return result;
}

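My code works flawlessly and is quite fast until I give it a huge text file. Then I get a stack overflow error from Visual Studio. When I switch to the "Release" configuration, the code runs fine. Is something wrong with my code, or is there some limitation when running in the "Debug" configuration? Am I trying to do too much in one function? If so, how can I break it up so that it doesn't crash while debugging?

Edit: as requested, my addWord implementation: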

void BSTIndex::addWord(BSTIndexNode *& pCurrentRoot, string word, Line * pLine)
    {
        if(pCurrentRoot == NULL)  //BST is empty
        {
            BSTIndexNode * nodeToAdd = new BSTIndexNode();
            nodeToAdd->word = word;
            nodeToAdd->pData = pLine;
            pCurrentRoot = nodeToAdd;
            return;
        }
        //BST not empty
        if (word < (pCurrentRoot->word)) //Go left
        {
            addWord(pCurrentRoot->pLeft, word, pLine);
        }
        else //Go right
        {
            addWord(pCurrentRoot->pRight, word, pLine);
        }
    }

And lineParser:

vector<string> Database::lineParser(Line * ln) //Parses a line and returns a vector of the words it contains
{
    vector<string> result;
    string word;
    string line = ln->getLine();
    //Regular Expression, matches anything that is not a letter, number, whitespace, or apostrophe
    tr1::regex regEx("[^A-Za-z0-9\\s\\']");
    //Using regEx above, replaces all non matching characters with nothing, essentially removing them.
    line = tr1::regex_replace(line, regEx, std::string(""));

    istringstream iss(line);
    while(iss >> word)
    {
        word = getLowercaseWord(word);
        result.push_back(word);
    }
    return result;
}

Answer 1

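A stack overflow means you have run out of stack space (probably obvious, but just in case). Typical causes are non-terminating or excessive recursion, or very large stack objects being copied repeatedly. Interestingly, any of these could be what is happening here.

Most likely your compiler is performing tail-call optimization in Release, which keeps the deep recursion from overflowing the stack.

In Release your compiler may also be optimizing away the returned copy of the vector from lineParser.

So you need to find out under what conditions it overflows in Debug. I would start with the recursion as the most likely culprit. Try changing the string parameter type to a reference, i.e.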

void BSTIndex::addWord(BSTIndexNode *& pCurrentRoot, const string & word, Line * pLine)

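This should stop the word object from being copied in every nested addWord call.

Also consider adding a statement such as std::cout << "recursing addWord" << std::endl; to addWord, so you can see how deep the recursion goes and whether it terminates correctly.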
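As a rough sketch of both suggestions together, the insert below takes the word by const reference and reports the depth at which the new node lands, instead of printing on every call. Node and addWordTracked are hypothetical stand-ins for the asker's BSTIndexNode and addWord. The Line pointer and cleanup of the nodes are left out to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for BSTIndexNode (the Line pointer is omitted).
struct Node {
    std::string word;
    Node* pLeft = nullptr;
    Node* pRight = nullptr;
};

// Recursive insert: the word is passed by const reference, so no string
// copy is made per nested call. Returns the depth of the new node
// (0 means it became the root).
std::size_t addWordTracked(Node*& root, const std::string& word,
                           std::size_t depth = 0) {
    if (root == nullptr) {  // empty slot found: place the node here
        root = new Node();
        root->word = word;
        return depth;
    }
    if (word < root->word) {  // go left
        return addWordTracked(root->pLeft, word, depth + 1);
    }
    return addWordTracked(root->pRight, word, depth + 1);  // go right
}
```

If the returned depth grows roughly in step with the number of words inserted, the tree is degenerating into a list, and the recursion depth (and the stack use) grows with it.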

Answer 2

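The problem is almost certainly the recursive call in addWord. In a non-optimized build it consumes a lot of stack space, whereas in an optimized build the compiler turns it into a tail call that reuses the same stack frame.

You can convert the recursive call into a loop by hand very easily: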

void BSTIndex::addWord(BSTIndexNode ** pCurrentRoot, string word, Line * pLine)
{
    while (*pCurrentRoot != NULL) {
        //BST not empty
        if (word < (*pCurrentRoot)->word) //Go left
        {
            pCurrentRoot = &(*pCurrentRoot)->pLeft;
        }
        else //Go right
        {
            pCurrentRoot = &(*pCurrentRoot)->pRight;
        }
    }
    //Reached the empty slot where the new node belongs
    BSTIndexNode * nodeToAdd = new BSTIndexNode();
    nodeToAdd->word = word;
    nodeToAdd->pData = pLine;
    *pCurrentRoot = nodeToAdd;
}
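To illustrate why the loop form is safe even for an unbalanced tree, here is a small self-contained sketch. IndexNode, insertIterative and maxDepthAfterInserting are hypothetical names, not part of the asker's code, and the Line pointer is omitted. Inserting already-sorted words makes the tree as deep as the number of words, but no call stack grows with it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for BSTIndexNode (the Line pointer is omitted).
struct IndexNode {
    std::string word;
    IndexNode* pLeft = nullptr;
    IndexNode* pRight = nullptr;
};

// Iterative insert: walks down to the empty slot with a pointer-to-pointer,
// so stack use is constant no matter how deep the tree is.
// Returns the depth at which the new node was placed.
std::size_t insertIterative(IndexNode** slot, const std::string& word) {
    std::size_t depth = 0;
    while (*slot != nullptr) {
        slot = (word < (*slot)->word) ? &(*slot)->pLeft : &(*slot)->pRight;
        ++depth;
    }
    *slot = new IndexNode();
    (*slot)->word = word;
    return depth;
}

// Inserts the words in the given order and returns the deepest level reached.
std::size_t maxDepthAfterInserting(const std::vector<std::string>& words) {
    IndexNode* root = nullptr;
    std::size_t deepest = 0;
    for (const std::string& w : words) {
        const std::size_t d = insertIterative(&root, w);
        if (d > deepest) deepest = d;
    }
    // Free the nodes with an explicit stack, again avoiding recursion.
    std::vector<IndexNode*> pending;
    if (root != nullptr) pending.push_back(root);
    while (!pending.empty()) {
        IndexNode* n = pending.back();
        pending.pop_back();
        if (n->pLeft != nullptr) pending.push_back(n->pLeft);
        if (n->pRight != nullptr) pending.push_back(n->pRight);
        delete n;
    }
    return deepest;
}
```

With sorted input such as "a", "b", "c", "d", every word goes to the right of the previous one, so the last node sits at depth 3. A recursive insert would need one stack frame per level to get there.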

Answer 3

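You should also post your call stack, which would show what is actually causing the overflow. It seems fairly clear that the recursion in addWord is consuming a significant amount of stack memory.

If you just want it to work, go into the compiler/linker settings and increase the size reserved for the stack. By default it is only 1MB. Raise it to 32MB or so and you can be confident that, whatever extra counters or probes are involved, you have enough stack to handle them.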
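For reference, 32MB is 33554432 bytes. A minimal sketch of requesting that reserve from source, assuming the MSVC toolchain (the same value can instead be passed to the linker as /STACK:33554432 in the project settings):

```cpp
// MSVC-only: ask the linker to reserve 32 MB (32 * 1024 * 1024 bytes) of stack
// for the main thread instead of the 1 MB default. Other compilers do not
// understand this pragma, hence the guard.
#ifdef _MSC_VER
#pragma comment(linker, "/STACK:33554432")
#endif
```

This only raises the limit. A sufficiently deep, unbalanced tree can still exhaust the larger stack.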

Answer 4

You can increase the stack size to a suitable number of bytes.

#pragma comment(linker, "/STACK:1000000000")

Stack overflow in Debug but not in Release
