Question

I'm new to PEG parsing and am trying to write a simple parser for an expression such as "term1 OR term2 anotherterm". Ideally, the resulting AST would look like:

          OR
-----------|---------
|                    |
"term1"            "term2 anotherterm"

I'm currently using Grappa (https://github.com/fge/grappa), but it doesn't even match the more basic expression "term1 OR term2". This is what I have:

package grappa;

import com.github.fge.grappa.annotations.Label;
import com.github.fge.grappa.parsers.BaseParser;
import com.github.fge.grappa.rules.Rule;

public class ExprParser extends BaseParser<Object> {

  @Label("expr")
  Rule expr() {
    return sequence(terms(), wsp(), string("OR"), wsp(), terms(), push(match()));
  }

  @Label("terms")
  Rule terms() {
    return sequence(whiteSpaces(),
        join(term()).using(wsp()).min(0),
        whiteSpaces());
  }

  @Label("term")
  Rule term() {
    return sequence(oneOrMore(character()), push(match()));
  }

  Rule character() {
    return anyOf(
        "0123456789" +
        "abcdefghijklmnopqrstuvwxyz" +
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
        "-_");
  }

  @Label("whiteSpaces")
  Rule whiteSpaces() {
    return join(zeroOrMore(wsp())).using(sequence(optional(cr()), lf())).min(0);
  }

}

Can anyone point me in the right direction?

Answer 1

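(Author of Grappa here...)

OK, so what you seem to want is actually a parse tree.

An extension to Grappa (2.0.x+) has recently been developed that does what you need: https://github.com/ChrisBrenton/grappa-parsetree.

By default, Grappa only "blindly" matches text and gives you a stack to work with, so you could have, for instance: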

public Rule oneOrOneOrEtc()
{
    // one() and or() stand for your own rules; each match of one() is pushed
    return join(sequence(one(), push(match()))).using(or()).min(1);
}

But then all of your matches end up on the stack... not very practical, but still usable in some situations (see, for instance, sonar-sslr-grappa).
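To make the "everything ends up on the stack" point concrete, here is a minimal sketch in plain Java. It does not use Grappa at all: `FlatStackSketch` and `matchAll` are hypothetical names, and a regex split stands in for the actual rule matching. It only illustrates that pushing each match yields a flat stack with no record of which operand belongs where.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class FlatStackSketch {
    // Hypothetical helper: pushes every operand of an "or"-separated input,
    // roughly mimicking a push after each matched operand.
    static Deque<String> matchAll(String input) {
        Deque<String> stack = new ArrayDeque<>();
        for (String operand : input.trim().split("\\s+or\\s+")) {
            stack.push(operand);
        }
        return stack;
    }

    public static void main(String[] args) {
        // The last operand pushed sits on top; no tree structure is kept.
        System.out.println(matchAll("a or b or c or d")); // prints [d, c, b, a]
    }
}
```

Whoever consumes such a stack has to rebuild the structure by popping values in the right order, which is why it gets impractical for anything tree-shaped.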

In your case, you want this package. With it, you can do this:

// define your root node
public final class Root
    extends ParseNode
{
    public Root(final String match, final List<ParseNode> children)
    {
        super(match, children);
    }
}

// define your parse node
public final class Alternative
    extends ParseNode
{
    public Alternative(final String match, final List<ParseNode> children)
    {
        super(match, children);
    }
}

This is the minimal implementation. Your parser can then look like this:

@GenerateNode(Alternative.class)
public Rule alternative() // or whatever
{
    return // whatever an alternative is
}

@GenerateNode(Root.class)
public Rule root()
{
    return join(alternative())
        .using(or())
        .min(1);
}

What happens here is that the root node is matched before the alternatives, so if, for instance, you have the string:

a or b or c or d

then the root node will match the "whole sequence", and it will have four Alternative children matching a, b, c and d respectively.
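The shape of that tree can be sketched in plain Java. This is only an illustration under stated assumptions, not grappa-parsetree's API: `ParseTreeSketch`, `Node` and `parseRoot` are hypothetical names, and a regex split replaces the real rule matching.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

final class ParseTreeSketch {
    // Hypothetical stand-in for a parse node: a label, the matched text, children.
    static final class Node {
        final String label;
        final String match;
        final List<Node> children;

        Node(String label, String match, List<Node> children) {
            this.label = label;
            this.match = match;
            this.children = children;
        }
    }

    // Builds a root node covering the whole input, with one child per operand
    // found between occurrences of the (whitespace-delimited) separator.
    static Node parseRoot(String input, String separator) {
        String regex = "\\s+" + Pattern.quote(separator) + "\\s+";
        List<Node> alternatives = new ArrayList<>();
        for (String part : input.trim().split(regex)) {
            alternatives.add(new Node("Alternative", part, Collections.emptyList()));
        }
        return new Node("Root", input, alternatives);
    }

    public static void main(String[] args) {
        Node root = parseRoot("a or b or c or d", "or");
        System.out.println(root.label + " matched: " + root.match);
        for (Node child : root.children) {
            System.out.println("  " + child.label + " matched: " + child.match);
        }
    }
}
```

With the input from the question, `parseRoot("term1 OR term2 anotherterm", "OR")` gives a root with two children, "term1" and "term2 anotherterm", which is the tree drawn in the question.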

Full credit here goes to Christopher Brenton for coming up with the idea in the first place!

Matching an OR expression using Grappa (Java PEG parser)
