Parsing a string with boost::spirit phrase_parse

Time: 2012-08-23 09:00:31

Tags: c++ boost-spirit

I want to parse the string

    std::string entry = "127.0.0.1 - [16/Aug/2012:01:50:02 +0000] \"GET /check.htm HTTP/1.1\" 200 17 \"AgentName/0.1 libwww-perl/5.833\"";

with the following rules:

    ip_rule %= lexeme[(+char_("0-9."))[ref(ip) = _1]];
    timestamp_rule %= lexeme[('[' >> +(char_ - ']') >> ']')[ref(timestamp) = _1]];
    user_rule %= lexeme[(+char_)[ref(user) = _1]];
    request_rule %= lexeme[('"' >> +(char_ - '"') >> '"')[ref(req) = _1]];
    referer_rule %= lexeme[('"' >> +(char_ - '"') >> '"')[ref(referer) = _1]];

    bool r = phrase_parse(first, last,
    ip_rule >> user_rule >> timestamp_rule >> request_rule >> uint_[ref(status) = _1]
    >> uint_[ref(transferred_bytes) = _1] >> referer_rule, space);

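But it does not match. It does match if I remove the " - " from the string and, of course, drop user_rule from the parser. Could you tell me how to match the string with the " - " left in?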

1 answer:

Answer 0 (score: 3)

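Your user_rule "eats" the rest of the text: +char_ matches any character, so under the space skipper it consumes every non-space character up to the end of the input and leaves nothing for timestamp_rule. Define it like this instead: +~qi::char_("["), so that it stops at the '[' character. The following code works as expected: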

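    #include <boost/spirit/include/qi.hpp>
    #include <string>
    using namespace boost::spirit::qi;

    int main()
    {
        std::string ip, user, timestamp, req, referer;
        unsigned status, transferred_bytes;
        std::string entry = "127.0.0.1 - [16/Aug/2012:01:50:02 +0000] \"GET /check.htm HTTP/1.1\" 200 17 \"AgentName/0.1 libwww-perl/5.833\"";

        // Named iterators: phrase_parse advances `first` as it consumes input.
        std::string::iterator first = entry.begin(), last = entry.end();

        bool r = phrase_parse(first, last,
            lexeme[+char_("0-9.")] >>                   // ip
            +~char_("[") >>                             // user: stops at '['
            lexeme[('[' >> +~char_("]") >> ']')] >>     // timestamp, brackets dropped
            lexeme[('"' >> +~char_("\"") >> '"')] >>    // request, quotes dropped
            uint_ >>                                    // status
            uint_ >>                                    // transferred bytes
            lexeme[('"' >> +~char_("\"") >> '"')],      // last quoted field
            space, ip, user, timestamp, req, status, transferred_bytes, referer);

        return r ? 0 : 1;   // exit code 0 when the whole pattern matched
    }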