boost tokenizer / char separator

Time: 2016-12-12 20:45:35

Tags: c++ boost tokenize

I have tried this code both with the commented-out argument and without it:

string separator1(""); //dont let quoted arguments escape themselves
string separator2(",\n"); //split on comma and newline
string separator3("\"\'"); //let it have quoted arguments

escaped_list_separator<char> els(separator1, separator2, separator3);
tokenizer<escaped_list_separator<char>> tok(str);//, els);


for (tokenizer<escaped_list_separator<char>>::iterator beg = tok.begin();beg!= tok.end(); ++beg) {
next = *beg;
boost::trim(next);
cout << counter << " " << next << endl;
counter++;
}

to split a file with the following format:

 12345, Test Test, Test
 98765, Test2 test2, Test2

This is the output:

0 12345
1 Test Test
2 Test
98765
3 Test2 test2
4 Test2

I'm not sure where the problem is, but what I need is for 98765 to get the number 3 in front of it.

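2 answers:

Answer 0 (score: 1)

You forgot the newline separator: string separator2(",\n"); In the question the tokenizer is constructed as tok(str) with els commented out, so the default escaped_list_separator is used, and it splits on commas only. Pass els to the tokenizer: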

#include <iostream>
#include <string>
#include <boost/tokenizer.hpp>
#include <boost/algorithm/string.hpp>

using namespace std;
using namespace boost;

int main() {
    string str = "TEst,hola\nhola";
    string separator1("");     // don't let quoted arguments escape themselves
    string separator2(",\n");  // split on comma and newline
    string separator3("\"");   // let it have quoted arguments

    escaped_list_separator<char> els(separator1, separator2, separator3);
    // Pass els explicitly; otherwise the default separator is used.
    tokenizer<escaped_list_separator<char>> tok(str, els);

    int counter = 0;
    string next;

    for (tokenizer<escaped_list_separator<char>>::iterator beg = tok.begin(); beg != tok.end(); ++beg) {
        next = *beg;
        boost::trim(next);
        cout << counter << " " << next << endl;
        counter++;
    }
    return 0;
}
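To see why the question's output shows 98765 without a number of its own, the effect of the delimiter set can be reproduced with the standard library alone. The following is only a minimal sketch: split_on is a hypothetical helper, not part of Boost, and it ignores the quoting and escaping that escaped_list_separator also handles.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not a Boost API): split `text` at every character
// that appears in `delims`, keeping empty tokens.
std::vector<std::string> split_on(const std::string& text, const std::string& delims) {
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type end = text.find_first_of(delims, start);
        if (end == std::string::npos) {
            tokens.push_back(text.substr(start));  // last token, up to end of input
            break;
        }
        tokens.push_back(text.substr(start, end - start));
        start = end + 1;  // continue after the delimiter
    }
    return tokens;
}
```

With the file contents from the question, split_on(text, ",") gives five tokens, and the third one still contains the line break between "Test" and "98765", which is why both appear under counter 2. split_on(text, ",\n") gives six tokens, with 98765 as a token of its own.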

Answer 1 (score: 0)

It looks to me like you are parsing, not splitting.

Using a parser generator would be superior, IMO.

Live On Coliru

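#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;

int main() {
    // Read from stdin without skipping whitespace, so the parser sees every character.
    boost::spirit::istream_iterator f(std::cin >> std::noskipws), l;

    // Columns are runs of non-separator characters, separated by end-of-line or a comma.
    std::vector<std::string> columns;
    qi::parse(f, l, +~qi::char_(",\r\n") % (qi::eol | ','), columns);

    size_t n = 0;
    for (auto& tok : columns) { std::cout << n++ << "\t" << tok << "\n"; }
}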

Prints

0   12345
1    Test Test
2    Test
3   98765
4    Test2 test2
5    Test2

Frankly, I think it is superior because it lets you write

phrase_parse(f, l, (qi::int_ >> *(',' >> +~qi::char_("\r\n,"))) % qi::eol, qi::blank...);

and get correct parsing of data types, whitespace skipping and so on for "free".
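The idea of parsing each line into typed values, instead of only splitting it, can also be illustrated without Spirit. The sketch below is a plain standard-library stand-in, not the Spirit grammar above: Record, trim_blanks and parse_records are hypothetical names, and it assumes every line has the form "integer, text, text, ...".

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record type for one input line: an integer id plus text fields.
struct Record {
    int id = 0;
    std::vector<std::string> fields;
};

// Strip leading and trailing spaces, tabs and carriage returns.
std::string trim_blanks(const std::string& s) {
    const char* blanks = " \t\r";
    const std::string::size_type first = s.find_first_not_of(blanks);
    if (first == std::string::npos) return "";
    const std::string::size_type last = s.find_last_not_of(blanks);
    return s.substr(first, last - first + 1);
}

// Parse "id, field, field" lines. Empty lines and lines whose first column
// does not start with an integer are skipped.
std::vector<Record> parse_records(const std::string& text) {
    std::vector<Record> records;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream cols(line);
        std::string col;
        if (!std::getline(cols, col, ',')) continue;  // empty line
        Record rec;
        try {
            // std::stoi skips leading whitespace; it also accepts trailing
            // non-digits, and a stricter check is left out of this sketch.
            rec.id = std::stoi(col);
        } catch (const std::exception&) {
            continue;  // first column is not an integer
        }
        while (std::getline(cols, col, ',')) rec.fields.push_back(trim_blanks(col));
        records.push_back(rec);
    }
    return records;
}
```

For the two lines from the question this yields two records, with ids 12345 and 98765 stored as int and the remaining columns stored as trimmed strings.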