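I am trying to parse a URL query string that follows some special rules. So far it works, with the one exception described below. The URL is parsed into a collection of key-value pairs using the following rules: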
const qi::rule<std::string::const_iterator, std::string()> key = qi::char_("a-zA-Z_") >> *qi::char_("a-zA-Z_0-9/%\\-_~\\.");
const qi::rule<std::string::const_iterator, std::string()> value = *(qi::char_ - '=' - '&');
const qi::rule<std::string::const_iterator, std::pair<std::string, std::string>()> pair = key >> -('=' >> value);
const qi::rule<std::string::const_iterator, std::unordered_map<std::string, std::string>()> query = pair >> *(('&') >> pair);
So far, so good. One special case is that the ampersand may appear as the XML entity &amp;amp;, so the query rule was upgraded to
const qi::rule<std::string::const_iterator, std::unordered_map<std::string, std::string>()> query = pair >> *((qi::lit("&amp;") | '&') >> pair);
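It worked as expected. Then another special case appeared: a "quoted" value may contain unescaped equals signs and ampersands, as in a=b&d=e&f=$$g=h&i=j$$&x=y&z=def, which should be parsed into a => b, d => e, f => g=h&i=j, x => y, z => def.
So I added an extra rule for "quoted" values: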
const qi::rule<std::string::const_iterator, std::string()> key = qi::char_("a-zA-Z_") >> *qi::char_("a-zA-Z_0-9/%\\-_~\\.");
const qi::rule<std::string::const_iterator, std::string()> escapedValue = qi::omit["$$"] >> *(qi::char_ - '$') >> qi::omit["$$"];
const qi::rule<std::string::const_iterator, std::string()> value = *(escapedValue | (qi::char_ - '=' - '&'));
const qi::rule<std::string::const_iterator, std::pair<std::string, std::string>()> pair = key >> -('=' >> value);
const qi::rule<std::string::const_iterator, std::unordered_map<std::string, std::string>()> query = pair >> *((qi::lit("&amp;") | '&') >> pair);
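Again, this worked as expected until the next case: a=b&d=e&f=$$g=h&i=j$$x=y&z=def. Note that there is no ampersand between the closing "$$" and the next key name. I expected this to be easy to solve by adding a Kleene operator around the separator, like this:
const qi::rule<std::string::const_iterator, std::unordered_map<std::string, std::string>()> query = pair >> *(*(qi::lit("&amp;") | '&') >> pair);
But for some reason it does not do the trick. Any suggestions would be appreciated!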
Edit: sample code
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <iostream>
#include <string>
#include <unordered_map>
#include <utility>

namespace rulez
{
    using namespace boost::spirit::qi;
    using It = std::string::const_iterator;
    // A key starts with a letter or underscore, followed by a limited character set.
    const rule<It, std::string()> key = boost::spirit::qi::char_("a-zA-Z_") >> *boost::spirit::qi::char_("a-zA-Z_0-9/%\\-_~\\.");
    // A "quoted" value is wrapped in $$ ... $$; the delimiters are dropped from the attribute.
    const rule<It, std::string()> escapedValue = boost::spirit::qi::omit["$$"] >> *(boost::spirit::qi::char_ - '$') >> boost::spirit::qi::omit["$$"];
    const rule<It, std::string()> value = *(escapedValue | (boost::spirit::qi::char_ - '=' - '&'));
    const rule<It, std::pair<std::string, std::string>()> pair = key >> -('=' >> value);
    // Pairs are separated by zero or more "&amp;" or "&" separators.
    const rule<It, std::unordered_map<std::string, std::string>()> query = pair >> *(*(boost::spirit::qi::lit("&amp;") | '&') >> pair);
}

int main()
{
    using namespace std;
    unordered_map<string, string> keyVal;
    //string const paramString = "a=b&d=e&f=$$g=h&i=j$$&x=y&z=def";
    string const paramString = "a=b&d=e&f=$$g=h&i=j$$x=y&z=def";
    boost::spirit::qi::parse(paramString.begin(), paramString.end(), rulez::query, keyVal);
    for (const auto& pair : keyVal)
        cout << "(\"" << pair.first << "\",\"" << pair.second << "\")" << endl;
}
Output for "a=b&d=e&f=$$g=h&i=j$$x=y&z=def" (wrong; it should be the same as for "a=b&d=e&f=$$g=h&i=j$$&x=y&z=def"):
("a","b"),("d","e"),("f","g=h&i=jx")
Output for "a=b&d=e&f=$$g=h&i=j$$&x=y&z=def" (as expected):
("a","b"),("d","e"),("f","g=h&i=j"),("x","y"),("z","def")
Edit: some simpler parsing rules, just to make things easier to follow
namespace rulez
{
const rule<std::string::const_iterator, std::string()> key = +(char_ - '&' - '=');
const rule<std::string::const_iterator, std::string()> escapedValue = omit["$$"] >> *(char_ - '$') >> omit["$$"];
const rule<std::string::const_iterator, std::string()> value = *(escapedValue | (char_ - '&' - '='));
const rule<std::string::const_iterator, pair<std::string, std::string>()> pair = key >> -('=' >> value);
const rule<std::string::const_iterator, unordered_map<std::string, std::string>()> query = pair >> *(*(lit('&')) >> pair);
}
Answer 0: (score: 1)
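I guess your problem is the value rule
value = *(escapedValue | (char_ - '&' - '='));
When parsing ...$$g=h&i=j$$x=...
$$g=h&i=j$$x=
^---------^
it parses the marked string $$g=h&i=j$$ as escapedValue, and then the Kleene operator (*) allows the second part of the value rule, (char_ - '&' - '='), to parse x as well. The rule stops only at =:
$$g=h&i=j$$x=
            ^
Perhaps something like this would help: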
Answer 1: (score: 0)
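The following solved the problem. However, I decided to drop the idea of using Spirit to parse query strings: every special case makes the grammar more cumbersome, and after a while nobody will remember why it was written the way it was :)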
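qi::rule<std::string::const_iterator, std::string()> key = +(qi::char_ - '=' - '&');
// A quoted value runs up to the next "$$", so a single '$' is allowed inside it.
qi::rule<std::string::const_iterator, std::string()> escapedValue = qi::omit["$$"] >> *(qi::char_ - "$$") >> qi::omit["$$"];
// A plain value must not start with "$$".
qi::rule<std::string::const_iterator, std::string()> nonEscapedValue = !qi::lit("$$") >> *(qi::char_ - '=' - '&');
// deep_copy stores the operands by value; a bare `auto` would keep references to temporaries.
auto sep = boost::proto::deep_copy(qi::lit("&amp;") | '&');
qi::rule<std::string::const_iterator, std::pair<std::string, boost::optional<std::string>>()> keyValue =
    key >> -('=' >> nonEscapedValue) >> (sep | qi::eoi);
// After a quoted value the separator is optional.
qi::rule<std::string::const_iterator, std::pair<std::string, boost::optional<std::string>>()> escapedKeyValue =
    key >> '=' >> escapedValue >> -(sep);
// hold[] discards whatever keyValue stored if it fails partway, before escapedKeyValue is tried.
auto query = boost::proto::deep_copy(*(qi::hold[keyValue] | escapedKeyValue));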
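For comparison, here is a minimal sketch of what a hand-written scanner for the same format might look like without Spirit. It is an illustration under stated assumptions, not the code the answerer ended up using: pairs are separated by "&amp;" or "&", a value may be wrapped in $$...$$, the separator after a quoted value is optional, and the first occurrence of a duplicate key wins. parseQueryByHand and QueryMap are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

using QueryMap = std::unordered_map<std::string, std::string>;

// Hypothetical helper (not from the original post): a hand-written scanner
// for key=value pairs with optional $$...$$ quoting of the value.
inline QueryMap parseQueryByHand(const std::string& s)
{
    QueryMap result;
    std::size_t pos = 0;
    while (pos < s.size()) {
        // Skip a separator: the XML entity "&amp;" or a bare '&'.
        if (s.compare(pos, 5, "&amp;") == 0) { pos += 5; continue; }
        if (s[pos] == '&') { ++pos; continue; }

        // The key runs up to the next '=' or '&' (or the end of input).
        std::size_t keyEnd = s.find_first_of("=&", pos);
        if (keyEnd == std::string::npos) keyEnd = s.size();
        std::string key = s.substr(pos, keyEnd - pos);
        if (key.empty()) break;  // stray '=': stop, as a failed grammar match would
        pos = keyEnd;

        std::string value;
        if (pos < s.size() && s[pos] == '=') {
            ++pos;
            if (s.compare(pos, 2, "$$") == 0) {
                // Quoted value: everything up to the closing "$$".
                // Assumption: an unterminated quote takes the rest of the input.
                const std::size_t start = pos + 2;
                const std::size_t close = s.find("$$", start);
                if (close == std::string::npos) {
                    value = s.substr(start);
                    pos = s.size();
                } else {
                    value = s.substr(start, close - start);
                    pos = close + 2;
                }
            } else {
                // Plain value: up to the next '=' or '&'.
                std::size_t valueEnd = s.find_first_of("=&", pos);
                if (valueEnd == std::string::npos) valueEnd = s.size();
                value = s.substr(pos, valueEnd - pos);
                pos = valueEnd;
            }
        }
        // emplace keeps the first value seen for a duplicate key.
        result.emplace(std::move(key), std::move(value));
    }
    return result;
}
```

Each special case here becomes one explicit branch with a comment, which may be easier to revisit later than a growing grammar, at the cost of losing the declarative form.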