Function that splits a string using string::find and string::substr returns wrong tokens

Time: 2015-04-08 17:29:20

Tags: c++ string

//splits a string into a vector of multiple tokens
std::vector<string> split_str(std::string& str, const char* delimiter){
    std::vector<string> ret;
    size_t currPos = 0;
    //Add the first element to the vector
    if (str.find(delimiter) != string::npos)
        ret.push_back(str.substr(currPos, str.find(delimiter)));


    while (currPos != str.size() - 1){

        if (str.find(delimiter, currPos) != string::npos){
            //Current at one past the delimiter
            currPos = str.find(delimiter, currPos) + 1;
            //Substring everything from one past the delimiter until the next delimiter
            ret.push_back(str.substr(currPos, str.find(delimiter, currPos)));
        }
        //If last whitespace is not right at the end
        else if (currPos < str.size()){
            //Add the last element to the vector and end the loop
            ret.push_back(str.substr(currPos, str.size()));
            currPos = str.size() - 1;
        }

    }
    return ret;
}

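The function should take a string and a delimiter as input and return a vector of strings (tokens) as output. However, when I try it with a simple input, for example:

ab bc cd de (the delimiter is " ")

the output is 5 elements: "ab", "bc cd", "cd de", "de", "de"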

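2 answers:

Answer 0 (score: 1)

The problem is that the second parameter of std::string::substr() is a count (the number of characters to copy), not an end position. Your code should be changed from: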

if (str.find(delimiter) != string::npos)
    ret.push_back(str.substr(currPos, str.find(delimiter)));

to this:

auto fpos = str.find(delimiter);
if (fpos != string::npos)
    ret.push_back(str.substr(currPos, fpos - currPos));
    //                                ^^^^^^^^^^^^^^

and likewise for the other substr calls in the function.
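As a minimal sketch of what the function could look like with the count-based substr applied throughout: the name split_str_fixed, the std::string delimiter parameter, and the decision to keep empty tokens are assumptions made for this illustration, not part of the original answer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Sketch only: splits `str` on every occurrence of `delimiter` using
// find + substr. `split_str_fixed` is a hypothetical name for illustration.
// Empty tokens produced by adjacent or trailing delimiters are kept.
std::vector<std::string> split_str_fixed(const std::string& str,
                                         const std::string& delimiter) {
    std::vector<std::string> ret;

    // An empty delimiter matches at every position, so treat the whole
    // string as a single token instead of looping forever.
    if (delimiter.empty()) {
        ret.push_back(str);
        return ret;
    }

    std::size_t start = 0;
    std::size_t end = str.find(delimiter, start);
    while (end != std::string::npos) {
        // Second argument is a count: the token length, not the end position.
        ret.push_back(str.substr(start, end - start));
        start = end + delimiter.size();
        end = str.find(delimiter, start);
    }

    // Whatever follows the last delimiter is the final token.
    ret.push_back(str.substr(start));
    return ret;
}
```

With the input from the question, split_str_fixed("ab bc cd de", " ") should yield the four tokens "ab", "bc", "cd" and "de". Note that under this design an input ending in a delimiter produces a trailing empty token; skipping empty tokens would be a separate choice.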

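Answer 1 (score: 0)

It is more correct to use find_first_of instead of find. Take into account that the string may contain adjacent spaces, and that it may begin with whitespace.

Here is a demonstrative program that shows how the function can be written: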
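To make the difference concrete, the small sketch below wraps the three standard member functions involved. The helper names (first_whole_match, first_any_char, first_token_start) are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers, named only for this illustration.

// find: position of the first occurrence of `delims` as a whole substring.
std::size_t first_whole_match(const std::string& s, const char* delims) {
    return s.find(delims);
}

// find_first_of: position of the first character that equals ANY one
// of the characters in `delims`.
std::size_t first_any_char(const std::string& s, const char* delims) {
    return s.find_first_of(delims);
}

// find_first_not_of: position of the first character at or after `pos`
// that is NOT a delimiter, which skips leading and adjacent delimiters.
std::size_t first_token_start(const std::string& s, const char* delims,
                              std::size_t pos = 0) {
    return s.find_first_not_of(delims, pos);
}
```

For a single-character delimiter such as " ", find and find_first_of return the same position. They differ once the delimiter string has several characters: with the string "ab,cd ef" and the delimiters ", ", find looks for the two-character sequence and returns std::string::npos, while find_first_of returns 2. For "  ab  cd" and " ", first_token_start returns 2 from position 0 and 6 from position 4, which is how runs of spaces get skipped.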

#include <iostream>
#include <string>
#include <vector>

std::vector<std::string> split_str( const std::string &s, const char *delimiter )
{
    std::vector<std::string> v;

    size_t n = 0;

    for ( std::string::size_type pos = 0;
          ( pos = s.find_first_not_of( delimiter, pos ) ) != std::string::npos;
          pos = s.find_first_of( delimiter, pos ) )
    {
        ++n;
    }        

    v.reserve( n );

    for ( std::string::size_type pos = 0;
          ( pos = s.find_first_not_of( delimiter, pos ) ) != std::string::npos; )
    {
        auto next_pos = s.find_first_of( delimiter, pos );

        if ( next_pos == std::string::npos ) next_pos = s.size();

        v.push_back( s.substr( pos, next_pos - pos ) );

        pos = next_pos;
    }        

    return v;
}


int main() 
{
    std::string s( "ab bc cd de " );

    std::cout << s << std::endl;    

    auto v = split_str( s, " " );

    for ( auto t : v ) std::cout << t << std::endl;

    return 0;
}

The program output is

ab bc cd de 
ab
bc
cd
de