Parsing recursive structures with boost::spirit

Time: 2013-12-12 16:03:47

Tags: c++ parsing boost-spirit boost-spirit-qi

I want to parse structures like "text {<>}". The Spirit documentation has a similar AST example. To parse a string like this:

<tag1>text1<tag2>text2</tag1></tag2>

this code works:

    templ     = (tree | text)       [_val = _1];

    start_tag = '<' 
            >> !lit('/') 
            >> lexeme[+(char_- '>') [_val += _1]] 
            >>'>'; 

    end_tag   =  "</" 
            >> string(_r1) 
            >> '>'; 

    tree =  start_tag          [at_c<1>(_val) = _1]
            >> *templ          [push_back(at_c<0>(_val), _1) ]
            >> end_tag(at_c<1>(_val) )
            ;

To parse a string like this:

<tag<tag>some_text>

this code does not work:

    templ     = (tree | text)       [_val = _1];


    tree =  '<'
            >> *templ          [push_back(at_c<0>(_val), _1) ]
            >> '>'
            ;

templ parses into a structure that uses recursive_wrapper:

namespace client {

   struct tmp;

   typedef boost::variant <
        boost::recursive_wrapper<tmp>,
        std::string
   > tmp_node;

   struct tmp {
     std::vector<tmp_node> content;
     std::string text;
   };
}

BOOST_FUSION_ADAPT_STRUCT(
     client::tmp,
     (std::vector<client::tmp_node>, content)
     (std::string, text)
)

Can anyone explain why this happens? Or does anyone know of a similar parser already written with boost::spirit?

1 answer:

Answer 0: (score: 2)

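Just guessing that you don't want to parse XML at all, but rather some kind of mixed-content markup language for hierarchical text, I would do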

        simple = +~qi::char_("><");
        nested = '<' >> *soup >> '>';
        soup   = nested|simple;

with the AST/rules defined as

typedef boost::make_recursive_variant<
        boost::variant<std::string, std::vector<boost::recursive_variant_> > 
    >::type tag_soup;

qi::rule<It, std::string()>           simple;
qi::rule<It, std::vector<tag_soup>()> nested;
qi::rule<It, tag_soup()>              soup;
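To make the mutual recursion between the three rules easier to follow, here is a minimal hand-written sketch of the same grammar using only the standard library. It is not Spirit code: the `Node` struct and the `parse_simple`, `parse_nested`, and `parse_soup` functions are hypothetical names introduced for illustration, and `Node` stands in for the recursive variant with a plain flag instead of `boost::variant`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical AST for this sketch: a node is either plain text
// or a list of child nodes (the role played by tag_soup above).
struct Node {
    bool is_text = false;
    std::string text;            // meaningful when is_text is true
    std::vector<Node> children;  // meaningful when is_text is false
};

bool parse_soup(const std::string& in, std::size_t& pos, Node& out);

// simple = +~char_("><"): one or more characters other than '<' and '>'.
bool parse_simple(const std::string& in, std::size_t& pos, Node& out) {
    const std::size_t start = pos;
    while (pos < in.size() && in[pos] != '<' && in[pos] != '>') ++pos;
    if (pos == start) return false;  // '+' requires at least one character
    out = Node();
    out.is_text = true;
    out.text = in.substr(start, pos - start);
    return true;
}

// nested = '<' >> *soup >> '>'
bool parse_nested(const std::string& in, std::size_t& pos, Node& out) {
    const std::size_t saved = pos;
    if (pos >= in.size() || in[pos] != '<') return false;
    ++pos;
    Node result;  // is_text stays false: this node holds children
    for (;;) {
        Node child;
        if (!parse_soup(in, pos, child)) break;  // '*' stops at the first failure
        result.children.push_back(child);
    }
    if (pos >= in.size() || in[pos] != '>') {
        pos = saved;  // backtrack so the caller can try another alternative
        return false;
    }
    ++pos;
    out = result;
    return true;
}

// soup = nested | simple
bool parse_soup(const std::string& in, std::size_t& pos, Node& out) {
    assert(pos <= in.size());
    return parse_nested(in, pos, out) || parse_simple(in, pos, out);
}
```

Calling `parse_soup` on `"<a<b>c>"` with `pos` starting at 0 should yield a nested node with three children: the text `a`, a nested node holding `b`, and the text `c`. In this sketch the text scanner has to stop at both `<` and `>`: if it accepted `>`, it would consume the closing delimiter and `parse_nested` could never match it.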

See it Live On Coliru

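// #define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <boost/variant/recursive_variant.hpp>

#include <algorithm> // std::copy
#include <fstream>
#include <iostream>
#include <iterator>  // std::ostream_iterator, std::istreambuf_iterator
#include <string>
#include <vector>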

namespace client
{
    typedef boost::make_recursive_variant<
            boost::variant<std::string, std::vector<boost::recursive_variant_> > 
        >::type tag_soup;

    namespace qi = boost::spirit::qi;

    template <typename It>
    struct parser : qi::grammar<It, tag_soup()>
    {
        parser() : parser::base_type(soup)
        {
            simple = +~qi::char_("><");
            nested = '<' >> *soup >> '>';
            soup   = nested|simple;

            BOOST_SPIRIT_DEBUG_NODES((simple)(nested)(soup))
        }
      private:
        qi::rule<It, std::string()>           simple;
        qi::rule<It, std::vector<tag_soup>()> nested;
        qi::rule<It, tag_soup()>              soup;
    };
}

namespace boost { // leverage ADL on variant<>
    static std::ostream& operator<<(std::ostream& os, std::vector<client::tag_soup> const& soup)
    {
        os << "<";
        std::copy(soup.begin(), soup.end(), std::ostream_iterator<client::tag_soup>(os));
        return os << ">";
    }
}

int main(int argc, char **argv)
{
    if (argc < 2) {
        std::cerr << "Error: No input file provided.\n";
        return 1;
    }

    std::ifstream in(argv[1]);
    std::string const storage(std::istreambuf_iterator<char>(in), {}); // We will read the contents here.

    if (!(in || in.eof())) {
        std::cerr << "Error: Could not read from input file\n";
        return 1;
    }

    static const client::parser<std::string::const_iterator> p;

    client::tag_soup ast; // Our tree
    bool ok = parse(storage.begin(), storage.end(), p, ast);

    if (ok) std::cout << "Parsing succeeded\nData: " << ast << "\n";
    else    std::cout << "Parsing failed\n";

    return ok? 0 : 1;
}

If you define BOOST_SPIRIT_DEBUG, you will get verbose output of the parsing process.

The input

<some text with nested <tags <etc...> >more text>

prints

Parsing succeeded
Data: <some text with nested <tags <etc...> >more text>

Note that the output is printed from the variant, not from the original text.
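As a rough illustration of what printing from the tree means, the sketch below rebuilds the text from a small hand-made tree by walking it recursively, much like the `operator<<` overload for `std::vector<client::tag_soup>` does. It uses only the standard library; `Node`, `text_node`, `nested_node`, `print`, and `render` are hypothetical names for this example and are not part of the answer's code.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for tag_soup: either text or a list of children.
struct Node {
    bool is_text;
    std::string text;
    std::vector<Node> children;
};

// Build a text leaf. The grammar's `simple` rule matches one or more
// characters, so a text node is assumed to be non-empty.
Node text_node(const std::string& s) {
    assert(!s.empty());
    Node n;
    n.is_text = true;
    n.text = s;
    return n;
}

// Build a nested node from its children.
Node nested_node(const std::vector<Node>& kids) {
    Node n;
    n.is_text = false;
    n.children = kids;
    return n;
}

// Text is written as-is; a nested node writes '<', its children, then '>'.
void print(std::ostream& os, const Node& n) {
    if (n.is_text) {
        os << n.text;
        return;
    }
    os << '<';
    for (const Node& child : n.children) print(os, child);
    os << '>';
}

// Convenience wrapper that returns the regenerated text as a string.
std::string render(const Node& n) {
    std::ostringstream os;
    print(os, n);
    return os.str();
}
```

Because the text nodes keep every character between the delimiters, including spaces, the regenerated string matches the input exactly, which is why the printed data above looks identical to the original line.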