I have the following code (C++0x):
    const set<char> s_special_characters = { '(', ')', '{', '}', ':' };

    void nectar_loader::tokenize( string &line, const set<char> &special_characters )
    {
        auto it = line.begin();
        const auto not_found = special_characters.end();

        // first character special case
        if( it != line.end() && special_characters.find( *it ) != not_found )
            it = line.insert( it+1, ' ' ) + 1;

        while( it != line.end() )
        {
            // check if we're dealing with a special character
            if( special_characters.find(*it) != not_found ) // <----------
            {
                // ensure a space before
                if( *(it-1) != ' ' )
                    it = line.insert( it, ' ' ) + 1;
                // ensure a space after
                if( (it+1) != line.end() && *(it+1) != ' ' )
                    it = line.insert( it+1, ' ');
                else
                    line.append(" ");
            }
            ++it;
        }
    }
The crash points to the marked line. It is a segfault with this gdb backtrace:
#0 0x000000000040f043 in std::less<char>::operator() (this=0x622a40, __x=@0x623610, __y=@0x644000)
at /usr/lib/gcc/x86_64-unknown-linux-gnu/4.5.2/../../../../include/c++/4.5.2/bits/stl_function.h:230
#1 0x000000000040efa6 in std::_Rb_tree<char, char, std::_Identity<char>, std::less<char>, std::allocator<char> >::_M_lower_bound (this=0x622a40, __x=0x6235f0, __y=0x622a48, __k=@0x644000)
at /usr/lib/gcc/x86_64-unknown-linux-gnu/4.5.2/../../../../include/c++/4.5.2/bits/stl_tree.h:1020
#2 0x000000000040e840 in std::_Rb_tree<char, char, std::_Identity<char>, std::less<char>, std::allocator<char> >::find (this=0x622a40, __k=@0x644000)
at /usr/lib/gcc/x86_64-unknown-linux-gnu/4.5.2/../../../../include/c++/4.5.2/bits/stl_tree.h:1532
#3 0x000000000040e4fd in std::set<char, std::less<char>, std::allocator<char> >::find (this=0x622a40, __x=@0x644000)
at /usr/lib/gcc/x86_64-unknown-linux-gnu/4.5.2/../../../../include/c++/4.5.2/bits/stl_set.h:589
#4 0x000000000040de51 in ambrosia::nectar_loader::tokenize (this=0x7fffffffe3b0, line=..., special_characters=...)
at ../../ambrosia/Library/Source/Ambrosia/nectar_loader.cpp:146
#5 0x000000000040dbf5 in ambrosia::nectar_loader::fetch_line (this=0x7fffffffe3b0)
at ../../ambrosia/Library/Source/Ambrosia/nectar_loader.cpp:112
#6 0x000000000040dd11 in ambrosia::nectar_loader::fetch_token (this=0x7fffffffe3b0, token=...)
at ../../ambrosia/Library/Source/Ambrosia/nectar_loader.cpp:121
#7 0x000000000040d9c4 in ambrosia::nectar_loader::next_token (this=0x7fffffffe3b0)
at ../../ambrosia/Library/Source/Ambrosia/nectar_loader.cpp:72
#8 0x000000000040e472 in ambrosia::nectar_loader::extract_nectar<std::back_insert_iterator<std::vector<ambrosia::target> > > (this=0x7fffffffe3b0, it=...)
at ../../ambrosia/Library/Source/Ambrosia/nectar_loader.cpp:43
#9 0x000000000040d46d in ambrosia::drink_nectar<std::back_insert_iterator<std::vector<ambrosia::target> > > (filename=..., it=...)
at ../../ambrosia/Library/Source/Ambrosia/nectar.cpp:75
#10 0x00000000004072ae in ambrosia::reader::event (this=0x623770)
I'm at a loss and have no idea what I'm doing wrong. Any help is much appreciated.
EDIT: the string at the time of the crash is
sub Ambrosia:lib libAmbrosia
UPDATE:
I replaced the above function as suggested in the comments/answers. Here is the result.
    const string tokenize( const string &line, const set<char> &special_characters )
    {
        const auto not_found = special_characters.end();
        const auto end = line.end();
        string result;

        if( !line.empty() )
        {
            // copy first character
            result += line[0];

            char previous = line[0];
            for( auto it = line.begin()+1; it != end; ++it )
            {
                const char current = *it;

                if( special_characters.find(previous) != not_found )
                    result += ' ';

                result += current;
                previous = current;
            }
        }
        return result;
    }
Answer 0 (score: 6)
Another guess is that line.append(" ") will sometimes invalidate it, depending on the original capacity of line.
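To illustrate the point: when append has to grow the string's buffer, every iterator into the old buffer dangles, so the next ++it and *it walk through freed memory. One way to keep the in-place approach is to track the position with an index instead of an iterator, since an index stays meaningful after a reallocation. The following is only a minimal sketch of that idea; pad_special_in_place is a hypothetical name, not something from the original code.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper (not part of the original post): pads every special
// character in `line` with a space on both sides, modifying `line` in place.
// The position is tracked as an index, which stays valid when the string
// reallocates, whereas an iterator would be invalidated.
void pad_special_in_place( std::string &line, const std::set<char> &special )
{
    for( std::size_t i = 0; i < line.size(); ++i )
    {
        if( special.count(line[i]) == 0 )
            continue;

        // ensure a space before (nothing to do at the very start of the line)
        if( i > 0 && line[i-1] != ' ' )
        {
            line.insert( i, 1, ' ' );
            ++i; // the special character moved one position to the right
        }
        // ensure a space after, including at the end of the line
        if( i+1 == line.size() || line[i+1] != ' ' )
            line.insert( i+1, 1, ' ' );
    }
}
```

With the string from the question, "sub Ambrosia:lib libAmbrosia", this should produce "sub Ambrosia : lib libAmbrosia".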
Answer 1 (score: 2)
You don't check that it != line.end() before you dereference it for the first time.
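Answer 2 (score: 0)
I can't spot the error. Since you have already pinned down where the problem occurs, I suggest stepping through it slowly with a debugger.
I'll just say that, in general, modifying the thing you are iterating over is very prone to failure.
I would suggest using Boost Tokenizer, more precisely boost::token_iterator combined with boost::char_separator (both come with code examples).
You could then build a new string from the first one and return the new string from the function. The gain in computation speed should cover the cost of the memory allocation.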
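For reference, the "build a new string" idea can also be done without Boost. This is only a sketch using the standard library; pad_special_copy is a made-up name, and the reserve size is a rough guess rather than a tuned value. Because the input is never modified while it is being read, no iterator or index can be invalidated.

```cpp
#include <set>
#include <string>

// Hypothetical helper (not part of the original post): returns a copy of
// `line` in which every special character has a space on both sides.
// The input is only read, so nothing is modified while iterating over it.
std::string pad_special_copy( const std::string &line, const std::set<char> &special )
{
    std::string result;
    result.reserve( line.size() + 8 ); // rough guess to limit reallocations

    bool after_special = false; // true if the previous character was special
    for( std::string::size_type i = 0; i < line.size(); ++i )
    {
        const char c = line[i];
        const bool is_special = special.count(c) != 0;

        // space owed after the previous special character
        if( after_special && c != ' ' )
            result += ' ';
        // space before this special character, unless at the start
        // of the line or already separated
        if( is_special && !result.empty() && result[result.size()-1] != ' ' )
            result += ' ';

        result += c;
        after_special = is_special;
    }
    // a special character at the end of the line still gets its trailing space
    if( after_special )
        result += ' ';
    return result;
}
```

Unlike the updated function in the question, which only adds a space after a special character, this sketch pads both sides, matching what the original in-place version was trying to do.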