Boost Spirit parser lookahead parsing

Time: 2017-05-24 22:31:33

Tags: c++ parsing c++11 boost boost-spirit

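I want to parse a string of the form string_number. I am not sure how to write the grammar for the Boost Qi parser.

Right now my grammar looks like this: +qi::char_("a-zA-Z0-9_-") >> lit('_') >> qi::int_

But it does not seem to work. Example strings:

ab_bcd_123 -> token(ab_bcd, 123)
ab_123 -> token(ab, 123)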

1 answer:

Answer 0 (score: 1)

  

"But it does not seem to work."

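That is because the 0-9 in the character class eats the digits: +qi::char_(...) is greedy and Qi never backtracks into it, so the trailing '_' and number are already consumed by the time '_' >> qi::int_ is tried. For the same reason the name part has to stop before a '_' that is followed by a number. This should work:

+(qi::char_("a-zA-Z_") - ('_' >> qi::uint_)) >> '_' >> qi::uint_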

If you also want to allow ab-3_bcd_123, devise a lookahead that detects whether you have reached the end, e.g. with eoi:

qi::raw[
    (+qi::alnum | '-' | '_') % (!('_' >> qi::uint_ >> qi::eoi))
] >> '_' >> qi::uint_

Though by now, I would just forget about it and do:

qi::lexeme [ +qi::char_("a-zA-Z0-9_-") ] [ _val = split_ident(_1) ];

See it Live On Coliru

#include <boost/fusion/adapted/std_pair.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

namespace qi = boost::spirit::qi;

using NumberedIdent = std::pair<std::string, int>;

namespace Demo {
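    struct SplitIdent {
        // Splits "name_123" at the last '_'. Sets pass to false when there is
        // no '_', the name is empty, or the suffix is not a number.
        NumberedIdent operator()(std::vector<char> const& v, bool& pass) const {
            std::string s(v.begin(), v.end());
            try {
                auto n = s.rfind('_');
                // rfind returns npos when there is no '_', and npos > 0 holds,
                // so test for npos explicitly (otherwise "123" would pass)
                pass = n != std::string::npos && n > 0;
                if (!pass) return {s, 0};
                return { s.substr(0, n), std::stoi(s.substr(n+1)) };
            } catch(...) { // std::stoi throws on an empty or non-numeric suffix
                pass = false; return {s, 0}; 
            }
        }
    };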

    using It = std::string::const_iterator;
    using namespace qi;

    static boost::phoenix::function<SplitIdent> split_ident;

    rule<It, NumberedIdent()> const rule
        = lexeme [ +char_("a-zA-Z0-9_-") ] [ _val = split_ident(_1, _pass) ];
}

int main() {
    for (std::string const input : {
           "ab_bcd_123",
           "ab-3_bcd_123 = 'something'",
           // failing:
           "ab_bcd_123_q = 'oops'",
           "ab_bcd_123_ = 'oops'",
           "_123 = 'oops'",
           "_",
           "q",
           ""
           }) 
    {
        NumberedIdent parsed;
        Demo::It f = input.begin(), l = input.end();

        bool ok = parse(f, l, Demo::rule, parsed);

        if (ok) {
            std::cout << "SUCCESS: ['" << parsed.first << "', " << parsed.second << "]\n";
        } else {
            std::cout << "parse failed ('" << input << "')\n";
        }

        if (f != l) {
            std::cout << "  remaining input '" << std::string(f,l) << "'\n";
        }
    }

}

Prints:

SUCCESS: ['ab_bcd', 123]
SUCCESS: ['ab-3_bcd', 123]
  remaining input ' = 'something''

And then all the failing test cases (by design):

parse failed ('ab_bcd_123_q = 'oops'')
  remaining input 'ab_bcd_123_q = 'oops''
parse failed ('ab_bcd_123_ = 'oops'')
  remaining input 'ab_bcd_123_ = 'oops''
parse failed ('_123 = 'oops'')
  remaining input '_123 = 'oops''
parse failed ('_')
  remaining input '_'
parse failed ('q')
  remaining input 'q'
parse failed ('')
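
One caveat with splitting at the last '_' via std::stoi is that std::stoi accepts a numeric prefix, so a suffix such as 12x converts to 12 without throwing. As a hedged sketch, the split step could validate the suffix explicitly; split_numbered_ident is a hypothetical standalone helper that uses only the standard library and is not part of the answer's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper: split "name_digits" at the last '_'.
// Returns false when there is no '_', the name is empty, the suffix is
// empty or contains a non-digit, or the number does not fit into an int.
inline bool split_numbered_ident(std::string const& s,
                                 std::pair<std::string, int>& out) {
    auto n = s.rfind('_');
    if (n == std::string::npos || n == 0 || n + 1 == s.size())
        return false;

    std::string tail = s.substr(n + 1);
    bool all_digits = std::all_of(tail.begin(), tail.end(),
        [](unsigned char c) { return std::isdigit(c) != 0; });
    if (!all_digits)
        return false;

    try {
        out = std::make_pair(s.substr(0, n), std::stoi(tail));
    } catch (std::out_of_range const&) {
        return false; // too many digits for an int
    }
    return true;
}
```

A semantic action like SplitIdent could delegate to such a function and assign its result to the pass flag, which keeps the exception handling out of the common failure paths.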