I am implementing a suffix trie in C++. The implementation of the Trie constructor is shown below.
#include <iostream>
#include <cstring>
#include "Trie.hpp"
using namespace std;

Trie::Trie(string T){
    T += "#"; //terminating character
    this->T = T;
    //The number of nodes is bounded above by n(n+1)/2. The reserve prevents reallocation (http://stackoverflow.com/questions/41557421/vectors-and-pointers/41557463)
    nodes.reserve(T.length() * (T.length() + 1) / 2);
    vector<string> suffix; //vector of suffixes
    for(unsigned int i = 0; i < T.length(); i++)
        suffix.push_back(T.substr(i, T.length()-i));
    //Create the Root, and start from it
    nodes.push_back(Node("")); //root has blank label
    Node* currentNode = &nodes[0];
    //While there are words in the array of suffixes
    while(!suffix.empty()){
        //If the character under consideration already has an edge, then this will be its index. Otherwise, it's -1.
        int edgeIndex = currentNode->childLoc(suffix[0].at(0));
        //If there is no such edge, add the rest of the word
        if(edgeIndex == -1){
            addWord(currentNode, suffix[0]); //add rest of word
            suffix.erase(suffix.begin()); //erase the suffix from the suffix vector
        }
        //if there is
        else{
            currentNode = (currentNode->getEdge(edgeIndex))->getTo(); //current Node is the next Node
            suffix[0] = suffix[0].substr(1, suffix[0].length()); //remove first character
        }
    }
}

//This function adds the rest of a word
void Trie::addWord(Node* parent, string word){
    for(unsigned int i = 0; i < word.length(); i++){ //For each remaining letter
        nodes.push_back(Node(parent->getLabel()+word.at(i))); //Add a node with label of parent + label of edge
        Edge e(word.at(i), parent, &nodes.back()); //Create an edge joining the parent to the node we just added
        parent->addEdge(e); //Join the two with this edge
    }
}

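I am using two data structures, Node and Edge, which have the getters, setters, and attributes you would expect. The method childLoc() returns the position of the edge representing a given character if such an edge exists, and -1 otherwise.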
The code compiles fine, but for some reason I get this error at runtime:
terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::at: __n (which is 0) >= this->size() (which is 0)
Aborted (core dumped)
I was told that this error means I am accessing the first character of an empty string, but I cannot see where in the code that happens.
Answer 0 (score: 0)
I see two parts of the code that could be responsible for the std::out_of_range:
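First: the following expression may access an empty string at position 0. This can happen because (as shown in the second part) you shrink the strings contained in the suffix vector: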
int edgeIndex = currentNode->childLoc(suffix[0].at(0));
Second: you operate on the entries of the suffix vector in a way that risks the string being too short:
suffix[0] = suffix[0].substr(1, suffix[0].length());
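The substr operation will also throw std::out_of_range if its first argument (the pos argument) exceeds the length of the string (cf. string::substr):
pos: Position of the first character to be copied as a substring. If this is equal to the string length, the function returns an empty string. If this is greater than the string length, it throws out_of_range. Note: The first character is denoted by a value of 0 (not 1).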
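Both behaviours can be checked in isolation. The following is a minimal sketch; the helper names throwsOnAt0 and throwsOnSubstr are made up for illustration and are not part of the question's code.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: true if s.at(0) throws std::out_of_range.
// at() is bounds-checked, so this happens exactly when s is empty.
bool throwsOnAt0(const std::string& s) {
    try {
        (void)s.at(0);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// Hypothetical helper: true if s.substr(pos) throws std::out_of_range.
// substr() throws only when pos is greater than s.size();
// pos == s.size() is allowed and yields an empty string.
bool throwsOnSubstr(const std::string& s, std::size_t pos) {
    try {
        (void)s.substr(pos);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

In the question's loop, substr(1, ...) is only reached after at(0) has already succeeded on the same string in that iteration, so the string is non-empty at that point and pos = 1 cannot exceed its length. The what() text in the question also names basic_string::at, which points towards the first expression rather than the second.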
To find out which of these expressions is actually responsible for the exception, I suggest using a debugger :-)
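While stepping through, two things in the posted code are worth watching: currentNode is never set back to the root after a suffix has been inserted, and parent is never advanced inside addWord. Since Node, Edge and childLoc() are not shown, whether these are what empties a suffix is only a guess on my part. For comparison, here is a minimal sketch of a construction that restarts at the root for every suffix and descends after each new node. SketchNode and buildSuffixTrie are hypothetical names, and the map of child indices is a stand-in for the question's Node/Edge classes, not their actual interface.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's Node/Edge classes:
// each node maps an edge character to the index of the child node.
struct SketchNode {
    std::map<char, std::size_t> children;
};

// Builds a suffix trie for text + '#'. Index 0 of the result is the root.
std::vector<SketchNode> buildSuffixTrie(std::string text) {
    text += '#';  // terminating character, as in the question
    std::vector<SketchNode> nodes(1);  // start with just the root
    for (std::size_t start = 0; start < text.size(); ++start) {
        std::size_t current = 0;  // every suffix is inserted starting at the root
        for (std::size_t i = start; i < text.size(); ++i) {
            const char c = text[i];
            auto it = nodes[current].children.find(c);
            if (it == nodes[current].children.end()) {
                // No edge for c yet: create the child and link it.
                nodes.push_back(SketchNode{});
                const std::size_t child = nodes.size() - 1;
                nodes[current].children[c] = child;
                current = child;  // descend so the rest of the suffix forms a chain
            } else {
                current = it->second;  // follow the existing edge
            }
        }
    }
    return nodes;
}
```

The sketch walks each suffix by index instead of shrinking strings, so there is no at(0) call on a string that may have become empty. It also stores child indices rather than pointers, so it does not rely on reserve() to keep pointers valid. That matters because the root is not counted in n(n+1)/2: for "ab#" (n = 3) the trie has 7 nodes including the root.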