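After reading the Spirit X3 tutorial on error handling and doing some experiments, I have come to a conclusion.
I believe there is room for improvement in X3's error handling. From my perspective, an important goal is to provide meaningful error messages. First and foremost, adding a semantic action that sets _pass(ctx) to false will not do, because X3 will simply try to match something else. Only throwing x3::expectation_failure exits the parse function prematurely, that is, without trying to match anything else. So what remains is the parser directive expect[a], the parser operator>, and manually throwing x3::expectation_failure from a semantic action. I believe this vocabulary for error handling is too limited. Please consider the following lines of X3 PEG grammar: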
const auto a = a1 >> a2 >> a3;
const auto b = b1 >> b2 >> b3;
const auto c = c1 >> c2 >> c3;
const auto main_rule__def =
(
a |
b |
c );
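Now, for expression a, I cannot use expect[] or operator>, because the other alternatives might be valid. I may be wrong, but I think X3 requires me to spell out alternative wrong expressions that can match and, if they match, throw x3::expectation_failure, which is cumbersome.
The question is: is there a good way to check for error conditions in my PEG construct, with the ordered alternatives a, b and c, using the current X3 facilities?
If the answer is no, I would like to present my idea for a reasonable solution. I believe I would need a new parser directive. What should this directive do? It should call the attached semantic action when the parse fails. The attribute is obviously unused, but I would need the _where member to be set to the iterator position of the first parsing mismatch. So if a2 fails, _where should be set to one past the end of a1. Let us call the parsing directive neg_sa, meaning "negated semantic action".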
pseudocode
// semantic actions
auto a_sa = [&](auto& ctx)
{
// add _where to vector v
};
auto b_sa = [&](auto& ctx)
{
// add _where to vector v
};
auto c_sa = [&](auto& ctx)
{
// add _where to vector v
// now we know we have a *real* error.
// find the peak iterator value in the vector v
// the position tells whether it belongs to a, b or c.
// now we can formulate an error message like: "cannot make sense of b up to this position."
// lastly throw x3::expectation_failure
};
// PEG
const auto a = a1 >> a2 >> a3;
const auto b = b1 >> b2 >> b3;
const auto c = c1 >> c2 >> c3;
const auto main_rule__def =
(
neg_sa[a][a_sa] |
neg_sa[b][b_sa] |
neg_sa[c][c_sa] );
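The bookkeeping described in the pseudocode can be tried out without any new directive. The following is a minimal sketch in plain C++17 with no Spirit involved; the names Branch, Attempt, match_sequence and furthest_failure are hypothetical and not part of X3. Each alternative reports how far it got before its first mismatch, and the branch that got furthest is blamed.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for one alternative: a name plus the literal
// tokens it expects in sequence (think a1 >> a2 >> a3).
struct Branch {
    std::string name;
    std::vector<std::string> tokens;
};

// Outcome of trying one branch: whether it matched, and how many input
// characters had been consumed when it stopped.
struct Attempt {
    bool matched;
    std::size_t consumed;
};

// Match the tokens of a branch one after another from the start of input.
Attempt match_sequence(Branch const& b, std::string_view input) {
    std::size_t pos = 0;
    for (auto const& tok : b.tokens) {
        if (input.substr(pos, tok.size()) != tok)
            return {false, pos};  // offset of the first mismatch
        pos += tok.size();
    }
    return {true, pos};
}

struct Failure {
    std::string branch;  // branch that progressed furthest
    std::size_t where;   // offset of its first mismatch
};

// Try the branches in order. Returns std::nullopt as soon as one matches,
// otherwise the failure that consumed the most input (first branch on ties).
std::optional<Failure> furthest_failure(std::vector<Branch> const& branches,
                                        std::string_view input) {
    assert(!branches.empty());
    Failure best{branches.front().name, 0};
    for (auto const& b : branches) {
        Attempt a = match_sequence(b, input);
        if (a.matched)
            return std::nullopt;
        if (a.consumed > best.where)
            best = Failure{b.name, a.consumed};
    }
    return best;
}
```

With branches a = x y z, b = x q r and c = k, the input "xqz" is blamed on b at offset 2. This is only a heuristic: when branches share a prefix, "furthest" is a guess about what the author of the input meant.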
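I hope I have presented the idea clearly. Let me know in the comments if anything needs further explanation.
Answer 0 (score: 1)
Okay, at the risk of conflating too many things in one example, here goes: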
namespace square::peg {
using namespace x3;
const auto quoted_string = lexeme['"' > *(print - '"') > '"'];
const auto bare_string = lexeme[alpha > *alnum] > ';';
const auto two_ints = int_ > int_;
const auto main = quoted_string | bare_string | two_ints;
const auto entry_point = skip(space)[ expect[main] > eoi ];
} // namespace square::peg
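That should do. The key is that the only things that should be expectation points are things that make the respective branch fail beyond the point where it was unambiguously the right branch. (Otherwise, there would literally not be a hard expectation.)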
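To make the distinction between a soft failure and an expectation point concrete, here is a hand-rolled sketch in plain C++17 that does not use X3 at all. The names hard_failure, parse_quoted, parse_bare and parse_value are hypothetical, and the sketch skips no whitespace, unlike the grammar above. A branch returns false while it is still unclear whether it applies, and throws once it is committed.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical hard-failure type, playing the role of an expectation failure.
struct hard_failure : std::runtime_error {
    std::size_t where;  // offset at which the expected element was missing
    hard_failure(std::string const& expected, std::size_t pos)
        : std::runtime_error(expected), where(pos) {}
};

// Soft failure (returns false) until the opening quote is seen. After that
// the branch is committed, so a missing closing quote is a hard failure.
bool parse_quoted(std::string_view in, std::string& out) {
    if (in.empty() || in.front() != '"')
        return false;  // not our branch: let the next alternative try
    auto close = in.find('"', 1);
    if (close == std::string_view::npos)
        throw hard_failure("'\"'", in.size());
    out = std::string(in.substr(1, close - 1));
    return true;
}

// Same idea for "identifier followed by ';'": the leading letter commits.
bool parse_bare(std::string_view in, std::string& out) {
    if (in.empty() || !std::isalpha(static_cast<unsigned char>(in.front())))
        return false;
    std::size_t pos = 1;
    while (pos < in.size() && std::isalnum(static_cast<unsigned char>(in[pos])))
        ++pos;
    if (pos == in.size() || in[pos] != ';')
        throw hard_failure("';'", pos);
    out = std::string(in.substr(0, pos));
    return true;
}

// Ordered alternatives: a soft failure falls through, a hard one propagates.
bool parse_value(std::string_view in, std::string& out) {
    return parse_quoted(in, out) || parse_bare(in, out);
}
```

An input such as 123 makes both branches fail softly, so parse_value simply returns false, whereas an unterminated quoted string throws because no other branch could possibly have been meant.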
With two minor get_info specializations for prettier messages¹, this can lead to decent error messages even when catching the exception manually:
int main() {
using It = std::string::const_iterator;
for (std::string const input : {
" -89 0038 ",
" \"-89 0038\" ",
" something123123 ;",
// undecidable
"",
// violate expectations, no successful parse
" -89 oops ", // not an integer
" \"-89 0038 ", // missing "
" bareword ", // missing ;
// trailing debris, successful "main"
" -89 3.14 ", // followed by .14
})
{
std::cout << "====== " << std::quoted(input) << "\n";
It iter = input.begin(), end = input.end();
try {
if (parse(iter, end, square::peg::entry_point)) {
std::cout << "Parsed successfully\n";
} else {
std::cout << "Parsing failed\n";
}
} catch (x3::expectation_failure<It> const& ef) {
auto pos = std::distance(input.begin(), ef.where());
std::cout << "Expect " << ef.which() << " at "
<< "\n\t" << input
<< "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^\n";
}
}
}
Prints:
====== " -89 0038 "
Parsed successfully
====== " \"-89 0038\" "
Parsed successfully
====== " something123123 ;"
Parsed successfully
====== ""
Expect quoted string, bare string or integer number pair at
^
====== " -89 oops "
Expect integral number at
-89 oops
-------^
====== " \"-89 0038 "
Expect '"' at
"-89 0038
--------------^
====== " bareword "
Expect ';' at
bareword
------------^
====== " -89 3.14 "
Expect eoi at
-89 3.14
--------^
That is already beyond what most people expect from their parsers.
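The caret line in that output relies on a small iostream idiom: std::setw applies only to the next formatted output, which here is an empty string, so the fill character is repeated pos times. A minimal sketch of the same formatting as a reusable function follows; the name caret_diagnostic is hypothetical, and the function returns a string instead of printing.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Build a diagnostic: a header, the input, then `pos` dashes and a caret
// pointing at the offending offset.
std::string caret_diagnostic(std::string const& what, std::string const& input,
                             std::size_t pos) {
    assert(pos <= input.size());
    std::ostringstream oss;
    oss << "Expect " << what << " at\n\t" << input << "\n\t"
        // setw pads the empty string to width pos using the fill character
        << std::setw(static_cast<int>(pos)) << std::setfill('-') << "" << "^";
    return oss.str();
}
```

With X3, pos would typically be std::distance(input.begin(), ef.where()) and what would be ef.which(), as in the catch block above.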
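We might not be content with reporting just one expectation and bailing out. Indeed, you can report and continue parsing as if there were just a regular mismatch: this is where on_error comes in.
Let's create a tag base: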
struct with_error_handling {
template<typename It, typename Ctx>
x3::error_handler_result on_error(It f, It l, expectation_failure<It> const& ef, Ctx const&) const {
std::string s(f,l);
auto pos = std::distance(f, ef.where());
std::cout << "Expecting " << ef.which() << " at "
<< "\n\t" << s
<< "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^\n";
return error_handler_result::fail;
}
};
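Now all we have to do is derive our rule IDs from with_error_handling and BAM! We don't have to write any exception handlers; rules will simply "fail" with the appropriate diagnostics. What's more, some inputs can lead to multiple (hopefully helpful) diagnostics: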
auto const eh = [](auto p) {
struct _ : with_error_handling {};
return rule<_> {} = p;
};
const auto quoted_string = eh(lexeme['"' > *(print - '"') > '"']);
const auto bare_string = eh(lexeme[alpha > *alnum] > ';');
const auto two_ints = eh(int_ > int_);
const auto main = quoted_string | bare_string | two_ints;
using main_type = std::remove_cv_t<decltype(main)>;
const auto entry_point = skip(space)[ eh(expect[main] > eoi) ];
Now, main becomes just:
for (std::string const input : {
" -89 0038 ",
" \"-89 0038\" ",
" something123123 ;",
// undecidable
"",
// violate expectations, no successful parse
" -89 oops ", // not an integer
" \"-89 0038 ", // missing "
" bareword ", // missing ;
// trailing debris, successful "main"
" -89 3.14 ", // followed by .14
})
{
std::cout << "====== " << std::quoted(input) << "\n";
It iter = input.begin(), end = input.end();
if (parse(iter, end, square::peg::entry_point)) {
std::cout << "Parsed successfully\n";
} else {
std::cout << "Parsing failed\n";
}
}
And the program prints:
====== " -89 0038 "
Parsed successfully
====== " \"-89 0038\" "
Parsed successfully
====== " something123123 ;"
Parsed successfully
====== ""
Expecting quoted string, bare string or integer number pair at
^
Parsing failed
====== " -89 oops "
Expecting integral number at
-89 oops
-------^
Expecting quoted string, bare string or integer number pair at
-89 oops
^
Parsing failed
====== " \"-89 0038 "
Expecting '"' at
"-89 0038
--------------^
Expecting quoted string, bare string or integer number pair at
"-89 0038
^
Parsing failed
====== " bareword "
Expecting ';' at
bareword
------------^
Expecting quoted string, bare string or integer number pair at
bareword
^
Parsing failed
====== " -89 3.14 "
Expecting eoi at
-89 3.14
--------^
Parsing failed
on_success
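Parsers aren't very useful when they don't actually parse anything, so let's add some constructive value handling, also showcasing on_success.
Defining some AST types to receive the attributes: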
struct quoted : std::string {};
struct bare : std::string {};
using two_i = std::pair<int, int>;
using Value = boost::variant<quoted, bare, two_i>;
Making sure we can print Value:
static inline std::ostream& operator<<(std::ostream& os, Value const& v) {
struct {
std::ostream& _os;
void operator()(quoted const& v) const { _os << "quoted(" << std::quoted(v) << ")"; }
void operator()(bare const& v) const { _os << "bare(" << v << ")"; }
void operator()(two_i const& v) const { _os << "two_i(" << v.first << ", " << v.second << ")"; }
} vis{os};
boost::apply_visitor(vis, v);
return os;
}
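Now, using the old as<> trick to coerce attribute types, this time with error handling (the as<> definition appears in the full program below).
As icing on the cake, let's demonstrate on_success in with_error_handling: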
template<typename It, typename Ctx>
void on_success(It f, It l, two_i const& v, Ctx const&) const {
std::cout << "Parsed " << std::quoted(std::string(f,l)) << " as integer pair " << v.first << ", " << v.second << "\n";
}
Now, with a largely unmodified main program (which also prints the result value):
It iter = input.begin(), end = input.end();
Value v;
if (parse(iter, end, square::peg::entry_point, v)) {
std::cout << "Result value: " << v << "\n";
} else {
std::cout << "Parsing failed\n";
}
Prints:
====== " -89 0038 "
Parsed "-89 0038" as integer pair -89, 38
Result value: two_i(-89, 38)
====== " \"-89 0038\" "
Result value: quoted("-89 0038")
====== " something123123 ;"
Result value: bare(something123123)
====== ""
Expecting quoted string, bare string or integer number pair at
^
Parsing failed
====== " -89 oops "
Expecting integral number at
-89 oops
-------^
Expecting quoted string, bare string or integer number pair at
-89 oops
^
Parsing failed
====== " \"-89 0038 "
Expecting '"' at
"-89 0038
--------------^
Expecting quoted string, bare string or integer number pair at
"-89 0038
^
Parsing failed
====== " bareword "
Expecting ';' at
bareword
------------^
Expecting quoted string, bare string or integer number pair at
bareword
^
Parsing failed
====== " -89 3.14 "
Parsed "-89 3" as integer pair -89, 3
Expecting eoi at
-89 3.14
--------^
Parsing failed
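I don't know about you, but I hate side effects, let alone printing to the console from a parser. Let's use x3::with instead.
We want to append to the diagnostics via the Ctx& argument instead of writing to std::cout in the on_error handler: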
struct with_error_handling {
struct diags;
template<typename It, typename Ctx>
x3::error_handler_result on_error(It f, It l, expectation_failure<It> const& ef, Ctx const& ctx) const {
std::string s(f,l);
auto pos = std::distance(f, ef.where());
std::ostringstream oss;
oss << "Expecting " << ef.which() << " at "
<< "\n\t" << s
<< "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^";
x3::get<diags>(ctx).push_back(oss.str());
return error_handler_result::fail;
}
};
And at the call site, we can pass the context:
std::vector<std::string> diags;
if (parse(iter, end, x3::with<D>(diags) [square::peg::entry_point], v)) {
std::cout << "Result value: " << v;
} else {
std::cout << "Parsing failed";
}
std::cout << " with " << diags.size() << " diagnostics messages: \n";
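The idea behind passing diags through the context can be illustrated without Spirit. The sketch below is a hypothetical miniature in plain C++, not X3's implementation: tagged_context, with_data, get_data and report are invented names. The context stores a reference to caller-owned data and is looked up by a tag type, so the handler appends to the caller's vector rather than printing.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of a tag-keyed context: it holds a reference to
// caller-owned data, and the Tag type serves only as a lookup key.
template <typename Tag, typename T>
struct tagged_context {
    T& value;
};

// Wrap caller-owned data under the given tag (T is deduced).
template <typename Tag, typename T>
tagged_context<Tag, T> with_data(T& value) {
    return {value};
}

// Retrieve the data stored under Tag; the reference still points at the
// caller's object, so modifications are visible after parsing.
template <typename Tag, typename T>
T& get_data(tagged_context<Tag, T> const& ctx) {
    return ctx.value;
}

struct diags_tag;  // an incomplete type is enough: it is only a key

// A handler that records a diagnostic instead of writing to std::cout.
template <typename Ctx>
void report(Ctx const& ctx, std::string message) {
    get_data<diags_tag>(ctx).push_back(std::move(message));
}
```

Because the context holds a reference, the caller can inspect the collected messages after the call returns, which is the same property the diags vector relies on above.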
The full program also prints the diagnostics:
//#define BOOST_SPIRIT_X3_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iomanip>
#include <iostream>
#include <sstream> // std::ostringstream
#include <string>
#include <vector>
namespace x3 = boost::spirit::x3;
struct quoted : std::string {};
struct bare : std::string {};
using two_i = std::pair<int, int>;
using Value = boost::variant<quoted, bare, two_i>;
static inline std::ostream& operator<<(std::ostream& os, Value const& v) {
struct {
std::ostream& _os;
void operator()(quoted const& v) const { _os << "quoted(" << std::quoted(v) << ")"; }
void operator()(bare const& v) const { _os << "bare(" << v << ")"; }
void operator()(two_i const& v) const { _os << "two_i(" << v.first << ", " << v.second << ")"; }
} vis{os};
boost::apply_visitor(vis, v);
return os;
}
namespace square::peg {
using namespace x3;
struct with_error_handling {
struct diags;
template<typename It, typename Ctx>
x3::error_handler_result on_error(It f, It l, expectation_failure<It> const& ef, Ctx const& ctx) const {
std::string s(f,l);
auto pos = std::distance(f, ef.where());
std::ostringstream oss;
oss << "Expecting " << ef.which() << " at "
<< "\n\t" << s
<< "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^";
x3::get<diags>(ctx).push_back(oss.str());
return error_handler_result::fail;
}
};
template <typename T = x3::unused_type> auto const as = [](auto p) {
struct _ : with_error_handling {};
return rule<_, T> {} = p;
};
const auto quoted_string = as<quoted>(lexeme['"' > *(print - '"') > '"']);
const auto bare_string = as<bare>(lexeme[alpha > *alnum] > ';');
const auto two_ints = as<two_i>(int_ > int_);
const auto main = quoted_string | bare_string | two_ints;
using main_type = std::remove_cv_t<decltype(main)>;
const auto entry_point = skip(space)[ as<Value>(expect[main] > eoi) ];
} // namespace square::peg
namespace boost::spirit::x3 {
template <> struct get_info<int_type> {
typedef std::string result_type;
std::string operator()(int_type const&) const { return "integral number"; }
};
template <> struct get_info<square::peg::main_type> {
typedef std::string result_type;
std::string operator()(square::peg::main_type const&) const { return "quoted string, bare string or integer number pair"; }
};
}
int main() {
using It = std::string::const_iterator;
using D = square::peg::with_error_handling::diags;
for (std::string const input : {
" -89 0038 ",
" \"-89 0038\" ",
" something123123 ;",
// undecidable
"",
// violate expectations, no successful parse
" -89 oops ", // not an integer
" \"-89 0038 ", // missing "
" bareword ", // missing ;
// trailing debris, successful "main"
" -89 3.14 ", // followed by .14
})
{
std::cout << "====== " << std::quoted(input) << "\n";
It iter = input.begin(), end = input.end();
Value v;
std::vector<std::string> diags;
if (parse(iter, end, x3::with<D>(diags) [square::peg::entry_point], v)) {
std::cout << "Result value: " << v;
} else {
std::cout << "Parsing failed";
}
std::cout << " with " << diags.size() << " diagnostics messages: \n";
for(auto& msg: diags) {
std::cout << " - " << msg << "\n";
}
}
}
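¹ You could use named rules instead, obviating this more complex trick.
² In older versions of the library you may have to fight to get reference semantics on the with<> data: Live On Coliru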
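Answer 1 (score: 0)
Now for expression a I can't use expect[] or operator>, since other alternatives might be valid. I might be wrong, but I think X3 requires me to spell out alternative wrong expressions that can match and, if they match, throw x3::expectation_failure, which is cumbersome.
That's simple: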
const auto main_rule__def = x3::expect [
a |
b |
c ];
Or, even:
const auto main_rule__def = x3::eps > (
a |
b |
c );
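If the answer is no, I would like to present my idea for a reasonable solution. I believe I would need a new parser directive. What should this directive do? It should call the attached semantic action when the parse fails.
The existing x3::on_error feature already knows how to do this. Mind you, it is a bit intricate, but by the same token it is also quite flexible.
Basically, you need to implement a static interface on the ID type (the ID in x3::rule<ID, Attr>, likely main_rule_class in your chosen convention). There are compiler examples in the repository that show how to use it.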
Side note: both on_success and on_error use this paradigm.
The on_error member will be called on a default-constructed copy of the ID type, with the arguments ID().on_error(first, last, expectation_failure_object, context).
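The "static interface on the ID type" pattern can be sketched in plain C++17 without Spirit. Everything below is a hypothetical imitation, not X3's actual code: handler_result, has_on_error and dispatch_error are invented names, and the handler takes a single string instead of iterators and a context. The point is only that the tag type is default-constructed and its on_error member is called if it exists.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical result type telling the caller what to do after an error.
enum class handler_result { fail, rethrow };

// Detection idiom: true if ID has an on_error(std::string const&) member.
template <typename ID, typename = void>
struct has_on_error : std::false_type {};

template <typename ID>
struct has_on_error<ID, std::void_t<decltype(std::declval<ID>().on_error(
                            std::declval<std::string const&>()))>>
    : std::true_type {};

// What an imaginary rule would do when an expectation fails:
// default-construct the tag and forward to its handler, if it has one.
template <typename ID>
handler_result dispatch_error(std::string const& what) {
    if constexpr (has_on_error<ID>::value)
        return ID().on_error(what);
    else
        return handler_result::rethrow;  // assumption: no handler, so propagate
}

// A tag without a handler.
struct silent_rule_id {};

// A tag that opts in simply by defining the member.
struct main_rule_id {
    handler_result on_error(std::string const&) const {
        return handler_result::fail;  // turn the error into an ordinary failure
    }
};
```

Because the tag is default-constructed for each call, handlers written this way cannot carry per-parse state in members; state has to come in through the arguments, which is why the context parameter matters in the real interface.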
const auto main_rule__def = ( neg_sa[a][a_sa] | neg_sa[b][b_sa] | neg_sa[c][c_sa] );
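To be honest, I think you are papering over your own confusion here. What good does it do you to have 3 separate error actions? How would you decide which error happened?
There are really only two possibilities:
Either you do know that a specific branch was implied and the input failed to satisfy it; in that case the expectation failure belongs inside that branch (a, b or c).
Or you don't know which branch was implied (say, when branches can start with similar productions and the failure happens inside those). In that case, nobody can tell which error handler should be invoked, so having more than one is beside the point.
Actually, the right thing to do is to fail the higher-level main_rule, which would mean "none of the possible branches succeeded".
This is the expect[a | b | c] way of dealing with it.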