How to write a custom error reporter in antlr's Go target

Question

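I'm trying to migrate an antlr project from C++ to Go. The grammar and code generation are mostly done (based on the solution provided in 65038949), but one pending item is writing a custom error reporter in Go.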

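I'm looking for a custom error reporter for these purposes:

I want to print my own custom message, possibly with extra information (e.g. the file name, which the default error printer does not print).
On every error, the error reporter updates a global counter, and the main program skips further processing if this error_count > 0.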

Here is what was done in the C++ project:

The custom message is defined in this function:

string MyErrorMessage(unsigned int l, unsigned int p, string m) {
    stringstream s;
    s << "ERR: line " << l << "::" << p << " " << m;
    global->errors++;
    return s.str();
}

And the antlr runtime (ConsoleErrorListener.cpp) was modified to call the function above:

void ConsoleErrorListener::syntaxError(IRecognizer *, Token * ,
  size_t line, size_t charPositionInLine, const std::string &msg, std::exception_ptr)  {
  std::cerr << MyErrorMessage(line, charPositionInLine, msg) << std::endl;
}

Finally, the main program skips further processing like this:

parser.top_rule();
if(global->errors > 0) {
    exit(0);
}

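How can I rewrite this C++ code for antlr's Go target?

Some additional notes after browsing the antlr runtime code (from github.com/antlr/antlr4/runtime/Go/antlr):

parser.go has a variable "_SyntaxErrors" that is incremented on every error, but nothing seems to use it. What is this variable for, and how can I use it after parsing to check whether any errors occurred? I tried the following, but evidently it did not work! (A workaround is to add a new variable MyErrorCount to the parser and increment it wherever _SyntaxErrors is incremented, but that is not an elegant solution, since I'm editing the runtime code!)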
```
tree := parser.Top_rule() // this is ok
fmt.Printf("errors=%d\n", parser._SyntaxErrors) // this gives a compiler error
//fmt.Printf("errors=%d\n", parser.MyErrorCount) // this is ok
```
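In the note above, I introduced a new variable in the antlr code and read it from user code: bad coding style, but it works. I also need to do the reverse, though: the antlr error reporter (error_listener.go:SyntaxError()) needs to read a user-code variable that holds the file name and print it. I could do this by adding a new function to antlr that passes this argument and registers the file name in a new variable, but is there a better way?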

Answer 1

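Antlr is great; one caveat, however, is that its error handling is not idiomatic Go. This makes the whole error process unintuitive for a Go engineer.

To inject your own error handling at each step (lexing, parsing, walking), you have to inject error listeners/handlers that panic. Panic & recover is a lot like Java exceptions, and I think that is why it was designed this way (Antlr is written in Java).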
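For readers less familiar with the mechanism, here is a minimal sketch of how panic and recover can play the role of a Java try/catch. The helper name try is made up for this illustration and is not part of antlr or the standard library.

```go
package main

import "fmt"

// try is a hypothetical helper: it runs fn and turns any panic raised
// inside it into an ordinary returned error, similar to a catch block.
func try(fn func()) (err error) {
	defer func() {
		// recover returns nil unless a panic is in flight.
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	fn()
	return nil
}

func main() {
	fmt.Println(try(func() { panic("boom") })) // the panic becomes an error value
	fmt.Println(try(func() {}))                // no panic, so the error is nil
}
```

The deferred function runs while the stack unwinds, which is the only place recover has any effect.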

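Lex/Parse Error Collection (easy to do)

You can implement as many ErrorListeners as you want. The default one used is ConsoleErrorListenerInstance. All it does is print to stderr on SyntaxErrors, so we remove it. The first step to custom error reporting is to replace it. I made a basic one that just collects the errors in a custom type that I can use/report on later.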

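type CustomSyntaxError struct {
    line, column int
    msg          string
}

// Error makes *CustomSyntaxError satisfy the error interface (needs "fmt").
func (c *CustomSyntaxError) Error() string {
    return fmt.Sprintf("line %d:%d %s", c.line, c.column, c.msg)
}

type CustomErrorListener struct {
    *antlr.DefaultErrorListener // Embed default which ensures we fit the interface
    Errors []error
}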

func (c *CustomErrorListener) SyntaxError(recognizer antlr.Recognizer, offendingSymbol interface{}, line, column int, msg string, e antlr.RecognitionException) {
    c.Errors = append(c.Errors, &CustomSyntaxError{
        line:   line,
        column: column,
        msg:    msg,
    })
}

You can inject the error listener on the parser/lexer (clearing the default listener at the same time).

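// is is the input stream, e.g. created with antlr.NewInputStream(text)
lexerErrors := &CustomErrorListener{}
lexer := NewMyLexer(is)
lexer.RemoveErrorListeners()
lexer.AddErrorListener(lexerErrors)

// stream is the token stream the parser reads from
stream := antlr.NewCommonTokenStream(lexer, antlr.TokenDefaultChannel)
parserErrors := &CustomErrorListener{}
parser := NewMyParser(stream)
parser.RemoveErrorListeners()
parser.AddErrorListener(parserErrors)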

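When lexing/parsing finishes, both data structures will hold the syntax errors found during the lex/parse stage. You can use the fields given in SyntaxError. You'll have to look elsewhere for the other interface functions, such as ReportAmbiguity.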

    if len(lexerErrors.Errors) > 0 {
        fmt.Printf("Lexer %d errors found\n", len(lexerErrors.Errors))
        for _, e := range lexerErrors.Errors {
            fmt.Println("\t", e.Error())
        }
    }

    if len(parserErrors.Errors) > 0 {
        fmt.Printf("Parser %d errors found\n", len(parserErrors.Errors))
        for _, e := range parserErrors.Errors {
            fmt.Println("\t", e.Error())
        }
    }
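To cover the two things the question asks for (a file name in the message and an error count checked by the main program), the listener itself can carry that state, so the runtime never needs editing. The sketch below is runnable on its own, which means it makes some assumptions: FileErrorListener and SyntaxIssue are made-up names, and plain interface{} parameters stand in for antlr.Recognizer and antlr.RecognitionException. In a real project the listener would embed *antlr.DefaultErrorListener and use the antlr types, as in the listener shown earlier.

```go
package main

import (
	"fmt"
	"os"
)

// SyntaxIssue (hypothetical name) records one error and the file it came from.
type SyntaxIssue struct {
	File         string
	Line, Column int
	Msg          string
}

// Error formats the message the same way as the C++ MyErrorMessage, plus the file name.
func (s *SyntaxIssue) Error() string {
	return fmt.Sprintf("ERR: %s line %d::%d %s", s.File, s.Line, s.Column, s.Msg)
}

// FileErrorListener (hypothetical name) knows which file is being parsed.
// Assumption: interface{} stands in for the antlr parameter types here.
type FileErrorListener struct {
	File   string
	Errors []error
}

// SyntaxError mirrors the shape of the antlr listener callback and stores the error.
func (l *FileErrorListener) SyntaxError(recognizer, offendingSymbol interface{}, line, column int, msg string, e interface{}) {
	l.Errors = append(l.Errors, &SyntaxIssue{File: l.File, Line: line, Column: column, Msg: msg})
}

// Count plays the role of the global error counter from the C++ version.
func (l *FileErrorListener) Count() int { return len(l.Errors) }

func main() {
	l := &FileErrorListener{File: "input.txt"}
	// Simulate the callback the runtime would make on a syntax error.
	l.SyntaxError(nil, nil, 3, 7, "mismatched input", nil)
	if l.Count() > 0 {
		for _, err := range l.Errors {
			fmt.Fprintln(os.Stderr, err)
		}
		return // skip further processing, as the C++ main did
	}
	fmt.Println("no errors")
}
```

Because the file name and the count live on the listener, each parsed file can get its own listener instance and no global variable is needed.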

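Lex/Parse Error Abort (not sure how solid this is)

Warning: this really feels janky. If you only need to collect errors, just do what is shown above!

To abort lexing/parsing, you must raise a panic in the error listener. Honestly, I don't get this design, but the lexing/parsing code is wrapped in panic recovers that check whether the panic is of type RecognitionException. This exception is passed as an argument to your ErrorListener, so modify SyntaxError as follows: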

func (c *CustomErrorListener) SyntaxError(recognizer antlr.Recognizer, offendingSymbol interface{}, line, column int, msg string, e antlr.RecognitionException) {
  // ...
  panic(e) // Feel free to only panic on certain conditions. This stops parsing/lexing
}
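The type check described above can be pictured with a small standalone sketch. This is not the antlr runtime code: recognitionError is a made-up stand-in for antlr.RecognitionException and runRule is a hypothetical wrapper, shown only to illustrate a recover that handles one panic type and re-raises everything else.

```go
package main

import "fmt"

// recognitionError is a hypothetical stand-in for antlr.RecognitionException.
type recognitionError struct{ msg string }

func (e *recognitionError) Error() string { return e.msg }

// runRule (hypothetical) guards body with a deferred recover. Panics that
// carry a *recognitionError are captured; any other panic is re-raised.
func runRule(body func()) (caught *recognitionError) {
	defer func() {
		if r := recover(); r != nil {
			re, ok := r.(*recognitionError)
			if !ok {
				panic(r) // not a recognition error: let it propagate
			}
			caught = re
		}
	}()
	body()
	return nil
}

func main() {
	got := runRule(func() { panic(&recognitionError{msg: "mismatched input"}) })
	fmt.Println(got != nil, got.msg)
}
```

A panic with any other value, for example a nil pointer dereference in your own listener, would not be swallowed by a guard like this, which is one reason panicking from a listener is easy to get wrong.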

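This panic is caught and passed to the ErrorHandler, which implements ErrorStrategy. The important function we care about is Recover(). Recover tries to recover from the error by consuming the token stream until the expected pattern/token can be found. Since we want to abort instead, we can take inspiration from BailErrorStrategy. That strategy still sucks, because it uses panics to stop all work. You can simply leave the implementations empty.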

type BetterBailErrorStrategy struct {
    *antlr.DefaultErrorStrategy
}

var _ antlr.ErrorStrategy = &BetterBailErrorStrategy{}

func NewBetterBailErrorStrategy() *BetterBailErrorStrategy {

    b := new(BetterBailErrorStrategy)

    b.DefaultErrorStrategy = antlr.NewDefaultErrorStrategy()

    return b
}

func (b *BetterBailErrorStrategy) ReportError(recognizer antlr.Parser, e antlr.RecognitionException) {
    // pass, do nothing
}


func (b *BetterBailErrorStrategy) Recover(recognizer antlr.Parser, e antlr.RecognitionException) {
    // pass, do nothing
}

// Make sure we don't attempt to recover from problems in subrules.
func (b *BetterBailErrorStrategy) Sync(recognizer antlr.Parser) {
    // pass, do nothing
}

Then add it to the parser:

parser.SetErrorHandler(NewBetterBailErrorStrategy())
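For context on what the empty Recover above gives up: default recovery consumes tokens until it reaches one the parser can resume from. The toy function below illustrates only that idea, on a slice of strings. It is a deliberate simplification, not the antlr algorithm, and skipToSync and the ";" sync token are assumptions made for the example.

```go
package main

import "fmt"

// skipToSync (hypothetical) returns the index of the first token at or after
// pos that belongs to the sync set, or len(tokens) if there is none.
func skipToSync(tokens []string, pos int, sync map[string]bool) int {
	for pos < len(tokens) && !sync[tokens[pos]] {
		pos++ // discard a token the parser cannot resume from
	}
	return pos
}

func main() {
	tokens := []string{"x", "=", "@", "#", ";", "y", "=", "1", ";"}
	sync := map[string]bool{";": true}
	// Suppose an error was detected at index 2 ("@").
	next := skipToSync(tokens, 2, sync)
	fmt.Println(next, tokens[next]) // resumes at the ";" that ends the bad statement
}
```

With the no-op strategy none of this skipping happens, so the parser does not try to resynchronize after the first error.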

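With that said, I'd advise just collecting the errors with listeners rather than bothering to abort early. BailErrorStrategy doesn't seem to work that well, and using panics to recover feels clunky in Go; it's easy to mess up.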
