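Learning Boost.Spirit: parsing INI

Time: 2015-08-20 14:13:24

Tags: c++ boost boost-spirit boost-spirit-qi

I have started learning Boost.Spirit and finished reading the "Qi - Writing Parsers" section. While reading, everything was easy to understand. But when I try to write something myself I get a lot of errors, because there are so many includes and namespaces and I need to know when to include or use each of them. As an exercise, I want to write a simple INI parser.

Here is the code (the includes come from one of the examples in the Spirit library, as does almost everything else):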

#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/spirit/include/phoenix_stl.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/include/phoenix_object.hpp>

#include <iostream>
#include <string>
#include <vector>
#include <map>

namespace client
{
    typedef std::map<std::string, std::string> key_value_map_t;

    struct mini_ini
    {
        std::string name;
        key_value_map_t key_values_map;
    };
} // client

BOOST_FUSION_ADAPT_STRUCT(
    client::mini_ini,
    (std::string, name)
    (client::key_value_map_t, key_values_map)
)

namespace client
{
    namespace qi = boost::spirit::qi;
    namespace ascii = boost::spirit::ascii;
    namespace phoenix = boost::phoenix;

    template <typename Iterator>
    struct ini_grammar : qi::grammar<Iterator, mini_ini(), ascii::space_type>
    {
        ini_grammar() : ini_grammar::base_type(section_, "section")
        {
            using qi::char_;
            using qi::on_error;
            using qi::fail;
            using namespace qi::labels;
            using phoenix::construct;
            using phoenix::val;

            key_ = +char_("a-zA-Z_0-9");
            pair_ = key_ >> '=' >> *char_;
            section_ = '[' >> key_ >> ']' >> '\n' >> *(pair_ >> '\n');

            key_.name("key");
            pair_.name("pair");
            section_.name("section");

            on_error<fail>
            (
                section_
              , std::cout
                    << val("Error! Expecting ")
                    << _4                               // what failed?
                    << val(" here: \"")
                    << construct<std::string>(_3, _2)   // iterators to error-pos, end
                    << val("\"")
                    << std::endl
            );
        }

        qi::rule<Iterator, std::string(), ascii::space_type> key_;
        qi::rule<Iterator, mini_ini(), ascii::space_type> section_;
        qi::rule<Iterator, std::pair<std::string, std::string>(), ascii::space_type> pair_;
    };
} // client

int
main()
{
    std::string storage =
        "[section]\n"
        "key1=val1\n"
        "key2=val2\n";
    client::mini_ini ini;
    typedef client::ini_grammar<std::string::const_iterator> ini_grammar;
    ini_grammar grammar;

    using boost::spirit::ascii::space;
    std::string::const_iterator iter = storage.begin();
    std::string::const_iterator end = storage.end();
    bool r = phrase_parse(iter, end, grammar, space, ini);

    if (r && iter == end)
    {
        std::cout << "-------------------------\n";
        std::cout << "Parsing succeeded\n";
        std::cout << "-------------------------\n";

        return 0;
    }
    else
    {
        std::cout << "-------------------------\n";
        std::cout << "Parsing failed\n";
        std::cout << "-------------------------\n";
        std::cout << std::string(iter, end) << "\n";
        return 1;
    }

    return 0;
}

As you can see, I want to parse the following text into the mini_ini struct:

"[section]"
"key1=val1"
"key2=val2";

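The parse fails, and std::string(iter, end) is the complete input string.

My questions:

  • Why do I see the failure but no output from the on_error<fail> handler?
  • What would you suggest for learning Boost.Spirit? (I understand the documentation well in theory, but in practice I keep running into a lot of "why?" questions.)

Thanks.

1 answer:

Answer 0 (score: 2)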

  

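Q. Why do I see the failure but not the on_error handler?

The on_error handler fires only for the rule it is registered on (section_), and only when an expectation point fails.

Your grammar contains no expectation points (it only uses >>, never >).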

  

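Q. What would you suggest for learning Boost.Spirit? (I understand the documentation well in theory, but in practice I have a lot of "why?" questions.)

Just build the parsers you need. Copy good conventions from the documentation and from SO answers. There are plenty. As you can see, quite a few of them contain complete examples of INI parsers with varying levels of error reporting.

Bonus tips:

Do more detailed status reporting: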

bool ok = phrase_parse(iter, end, grammar, space, ini);

if (ok) {
    std::cout << "Parse success\n";
} else {
    std::cout << "Parse failure\n";
}

if (iter != end) {
    std::cout << "Remaining unparsed: '" << std::string(iter, end) << "'\n";
}

return ok && (iter==end)? 0 : 1;

Use BOOST_SPIRIT_DEBUG:

#define BOOST_SPIRIT_DEBUG

// and later
BOOST_SPIRIT_DEBUG_NODES((key_)(pair_)(section_))

Prints:

<section_>
  <try>[section]\nkey1=val1\n</try>
  <key_>
    <try>section]\nkey1=val1\nk</try>
    <success>]\nkey1=val1\nkey2=val</success>
    <attributes>[[s, e, c, t, i, o, n]]</attributes>
  </key_>
  <fail/>
</section_>
Parse failure
Remaining unparsed: '[section]
key1=val1
key2=val2
'

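You will notice that the section header is not parsed, because the newline is never matched. Your skipper (space_type) skips newlines, so the literal '\n' can never match. See: Boost spirit skipper issues

Fixing the skipper

With blank_type as the skipper, the parse succeeds: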

<section_>
  <try>[section]\nkey1=val1\n</try>
  <key_>
    <try>section]\nkey1=val1\nk</try>
    <success>]\nkey1=val1\nkey2=val</success>
    <attributes>[[s, e, c, t, i, o, n]]</attributes>
  </key_>
  <pair_>
    <try>key1=val1\nkey2=val2\n</try>
    <key_>
      <try>key1=val1\nkey2=val2\n</try>
      <success>=val1\nkey2=val2\n</success>
      <attributes>[[k, e, y, 1]]</attributes>
    </key_>
    <success></success>
    <attributes>[[[k, e, y, 1], [v, a, l, 1, 
, k, e, y, 2, =, v, a, l, 2, 
]]]</attributes>
  </pair_>
  <success>key1=val1\nkey2=val2\n</success>
  <attributes>[[[s, e, c, t, i, o, n], []]]</attributes>
</section_>
Parse success
Remaining unparsed: 'key1=val1
key2=val2
'

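Note: the parse succeeds, but it does not do what you want. That is because *char_ also matches newlines, so the first value swallows the rest of the input and the following '\n' can no longer match. Exclude the line end from the value instead: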

       pair_ = key_ >> '=' >> *(char_ - qi::eol); // or
       pair_ = key_ >> '=' >> *~char_("\r\n"); // etc

Full code

Live On Coliru

#define BOOST_SPIRIT_DEBUG
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/spirit/include/phoenix_stl.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/include/phoenix_object.hpp>

#include <iostream>
#include <string>
#include <vector>
#include <map>

namespace client
{
    typedef std::map<std::string, std::string> key_value_map_t;

    struct mini_ini
    {
        std::string name;
        key_value_map_t key_values_map;
    };
} // client

BOOST_FUSION_ADAPT_STRUCT(
    client::mini_ini,
    (std::string, name)
    (client::key_value_map_t, key_values_map)
)

namespace client
{
    namespace qi      = boost::spirit::qi;
    namespace ascii   = boost::spirit::ascii;
    namespace phoenix = boost::phoenix;

    template <typename Iterator>
    struct ini_grammar : qi::grammar<Iterator, mini_ini(), ascii::blank_type>
    {
        ini_grammar() : ini_grammar::base_type(section_, "section")
        {
            using qi::char_;
            using qi::on_error;
            using qi::fail;
            using namespace qi::labels;
            using phoenix::construct;
            using phoenix::val;

            key_ = +char_("a-zA-Z_0-9");
            pair_ = key_ >> '=' >> *(char_ - qi::eol); // stop the value at the line end
            section_ = '[' >> key_ >> ']' >> '\n' >> *(pair_ >> '\n');

            BOOST_SPIRIT_DEBUG_NODES((key_)(pair_)(section_))

            on_error<fail>
            (
                section_
              , std::cout
                    << val("Error! Expecting ")
                    << _4                               // what failed?
                    << val(" here: \"")
                    << construct<std::string>(_3, _2)   // iterators to error-pos, end
                    << val("\"")
                    << std::endl
            );
        }

        qi::rule<Iterator, std::string(), ascii::blank_type> key_;
        qi::rule<Iterator, mini_ini(), ascii::blank_type> section_;
        qi::rule<Iterator, std::pair<std::string, std::string>(), ascii::blank_type> pair_;
    };
} // client

int
main()
{
    std::string storage =
        "[section]\n"
        "key1=val1\n"
        "key2=val2\n";
    client::mini_ini ini;
    typedef client::ini_grammar<std::string::const_iterator> ini_grammar;
    ini_grammar grammar;

    using boost::spirit::ascii::blank;
    std::string::const_iterator iter = storage.begin();
    std::string::const_iterator end = storage.end();
    bool ok = phrase_parse(iter, end, grammar, blank, ini);

    if (ok) {
        std::cout << "Parse success\n";
    } else {
        std::cout << "Parse failure\n";
    }

    if (iter != end) {
        std::cout << "Remaining unparsed: '" << std::string(iter, end) << "'\n";
    }

    return ok && (iter==end)? 0 : 1;
}