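What is the precise definition of a lookahead set?

Time: 2010-09-15 19:04:54

Tags: compiler-construction compiler-theory

I am writing a compiler and learning the theory behind syntax analysis. Although the lookahead set is a key concept for understanding recognition algorithms, the information about it on the web is fairly poor. StackOverflow seems to be in a unique position to fix this problem.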

1 answer:

Answer 0: (score: 8)

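The lookahead sets for a grammar are defined in terms of the lookahead sets for each of its non-terminals, which in turn rely on the lookahead set for each production. Determining lookahead sets helps us decide whether a grammar is LL(1) and, if it is, gives us the information we need to construct a recursive-descent parser for it.

Definition: LOOKAHEAD(X -> α) and LOOKAHEAD(X)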

LOOKAHEAD(X -> α) = FIRST(α) U FOLLOW(X), if NULLABLE(α)
LOOKAHEAD(X -> α) = FIRST(α), if not NULLABLE(α)
LOOKAHEAD(X) = LOOKAHEAD(X -> α) U LOOKAHEAD(X -> β) U LOOKAHEAD(X -> γ)

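where FIRST(α) is the set of terminals that α can begin with, FOLLOW(X) is the set of terminals that can come after an X anywhere in the grammar, and NULLABLE(α) says whether α can derive the empty sequence of terminals (denoted ε). The following definitions are taken from Torben Mogensen's free book, Basics of Compiler Design. See below for an example.

Definition: NULLABLE(X)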

NULLABLE(ε) = true
NULLABLE(x) = false, if x is a terminal
NULLABLE(αβ) = NULLABLE(α) and NULLABLE(β)
NULLABLE(P) = NULLABLE(α_1) or NULLABLE(α_2) or ... or NULLABLE(α_n),
               if P is a non-terminal and the right-hand-sides
               of all its productions are α_1, α_2, ..., α_n.

Definition: FIRST(X)

FIRST(ε) = Ø
FIRST(x) = {x}, assuming x is a terminal
FIRST(αβ) = FIRST(α) U FIRST(β), if NULLABLE(α)
          = FIRST(α), if not NULLABLE(α)
FIRST(P) = FIRST(α_1) U FIRST(α_2) U ... U FIRST(α_n),
               if P is a non-terminal and the right-hand-sides
               of all its productions are α_1, α_2, ..., α_n.
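
The FIRST equations can be evaluated the same way, by growing the sets until they stop changing. This is a sketch under assumptions: the grammar encoding and the function names are hypothetical, and `NULLABLE` is hard-coded to the value the example further down arrives at.

```python
# Hypothetical encoding of the example grammar; [] stands for ε.
GRAMMAR = {
    "E": [["n", "A"]],
    "A": [["E", "B"], []],
    "B": [["+", "A"], ["*", "A"]],
}
NULLABLE = {"A"}  # assumed to be already known

def first_of_sequence(symbols, first, nullable, grammar):
    """FIRST of a symbol sequence, using the current FIRST sets of non-terminals."""
    result = set()
    for sym in symbols:
        if sym not in grammar:   # terminal: FIRST(x) = {x}, and x is not nullable
            result.add(sym)
            break
        result |= first[sym]
        if sym not in nullable:  # FIRST(αβ) = FIRST(α) if α is not nullable
            break
    return result

def compute_first(grammar, nullable):
    """Iterate the FIRST equations until no set grows any further."""
    first = {head: set() for head in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of_sequence(body, first, nullable, grammar)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

for head, terminals in compute_first(GRAMMAR, NULLABLE).items():
    print(head, sorted(terminals))
```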

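Definition: FOLLOW(X)

A terminal symbol a is in FOLLOW(X) if and only if there is a derivation from the start symbol S of the grammar such that S ⇒ αXaβ, where α and β are (possibly empty) sequences of grammar symbols.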

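Intuition: FOLLOW(X)

Look at where X occurs in the grammar. All terminals that could follow it (directly or through any level of recursion) are in FOLLOW(X). Additionally, if X occurs at the end of a production (e.g. A -> foo X), or is followed by something else that can reduce to ε (e.g. A -> foo X B and B -> ε), then whatever can follow A can also follow X (i.e. FOLLOW(A) ⊆ FOLLOW(X)).

See the method for determining FOLLOW(X) in Torben's book, which is demonstrated just below.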
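
The intuition above translates into set constraints that can be solved by fixed-point iteration. The sketch below is one way to do that, not the book's own code: the grammar encoding and `compute_follow` are hypothetical names, the grammar is the example grammar extended with an assumed start production E' -> E $, and `NULLABLE` and `FIRST` are hard-coded to the values derived in the example.

```python
# Hypothetical encoding of the example grammar; [] stands for ε.
GRAMMAR = {
    "E'": [["E", "$"]],  # added start production; $ marks end of input
    "E": [["n", "A"]],
    "A": [["E", "B"], []],
    "B": [["+", "A"], ["*", "A"]],
}
NULLABLE = {"A"}
FIRST = {"E'": {"n"}, "E": {"n"}, "A": {"n"}, "B": {"+", "*"}}

def compute_follow(grammar, nullable, first):
    """Grow the FOLLOW sets until every constraint is satisfied."""
    follow = {head: set() for head in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue  # FOLLOW is only tracked for non-terminals
                    new = set()
                    rest_nullable = True
                    for nxt in body[i + 1:]:   # β, the part after this occurrence
                        if nxt not in grammar:  # terminal ends the scan
                            new.add(nxt)
                            rest_nullable = False
                            break
                        new |= first[nxt]
                        if nxt not in nullable:
                            rest_nullable = False
                            break
                    if rest_nullable:           # FOLLOW(head) ⊆ FOLLOW(sym)
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

for head, terminals in compute_follow(GRAMMAR, NULLABLE, FIRST).items():
    print(head, sorted(terminals))
```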

Example:

E -> n A
A -> E B
A -> ε
B -> + A
B -> * A

First, NULLABLE and FIRST are determined:

NULLABLE(E) = NULLABLE(n A) = NULLABLE(n) ∧ NULLABLE(A) = false
NULLABLE(A) = NULLABLE(E B) ∨ NULLABLE(ε) = true
NULLABLE(B) = NULLABLE(+ A) ∨ NULLABLE(* A) = false

FIRST(E) = FIRST(n A) = {n}
FIRST(A) = FIRST(E B) U FIRST(ε) = FIRST(E) U Ø = {n} (because E is not NULLABLE)
FIRST(B) = FIRST(+ A) U FIRST(* A) = FIRST(+) U FIRST(*) = {+, *}

Before FOLLOW is determined, the production E' -> E $ is added, where $ is treated as the "end-of-file" terminal. Then FOLLOW is determined. For each occurrence of a non-terminal on a right-hand side, β denotes whatever comes after that occurrence:

FOLLOW(E): Let β = $, so add the constraint that FIRST($) = {$} ⊆ FOLLOW(E)
           Let β = B, so add the constraint that FIRST(B) = {+, *} ⊆ FOLLOW(E)
FOLLOW(A): Let β = ε, so add the constraint that FIRST(ε) = Ø ⊆ FOLLOW(A).
           Because NULLABLE(ε), add the constraint that FOLLOW(E) ⊆ FOLLOW(A).
           Let β = ε, so add the constraint that FIRST(ε) = Ø ⊆ FOLLOW(A).
           Because NULLABLE(ε), add the constraint that FOLLOW(B) ⊆ FOLLOW(A).
           Let β = ε, so add the constraint that FIRST(ε) = Ø ⊆ FOLLOW(A).
           Because NULLABLE(ε), add the constraint that FOLLOW(B) ⊆ FOLLOW(A).
FOLLOW(B): Let β = ε, so add the constraint that FIRST(ε) = Ø ⊆ FOLLOW(B).
           Because NULLABLE(ε), add the constraint that FOLLOW(A) ⊆ FOLLOW(B).

Resolving these constraints (which could also be achieved by fixed-point iteration):

    {+, *, $} ⊆ FOLLOW(E)
    FOLLOW(E) ⊆ FOLLOW(A)
    FOLLOW(A) = FOLLOW(B)

    FOLLOW(E) = FOLLOW(A) = FOLLOW(B) = {+, *, $}.

Now LOOKAHEAD for each production can be determined:

LOOKAHEAD(E -> n A) = FIRST(n A) = {n}     because ¬NULLABLE(n A)
LOOKAHEAD(A -> E B) = FIRST(E B)           because ¬NULLABLE(E B)
                    = FIRST(E) = {n}       because ¬NULLABLE(E)
LOOKAHEAD(A -> ε)   = FIRST(ε) U FOLLOW(A) because NULLABLE(ε)
                    = Ø U {+, *, $} = {+, *, $}
LOOKAHEAD(B -> + A) = FIRST(+ A)           because ¬NULLABLE(+ A)
                    = FIRST(+) = {+}       because ¬NULLABLE(+)
LOOKAHEAD(B -> * A) = {*}                  for the same reason

Finally, LOOKAHEAD for each non-terminal can be determined as the union over its productions:

LOOKAHEAD(E) = LOOKAHEAD(E -> n A) = {n}
LOOKAHEAD(A) = LOOKAHEAD(A -> E B) U LOOKAHEAD(A -> ε) = {n} U {+, *, $} = {n, +, *, $}
LOOKAHEAD(B) = LOOKAHEAD(B -> + A) U LOOKAHEAD(B -> * A) = {+, *}

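With this knowledge we can decide whether the grammar is LL(1). The test is that, for each non-terminal, the lookahead sets of its productions must not overlap, so that a program reading only one symbol at a time can decide unambiguously which production to use. Here the sets for the two productions of A, {n} and {+, *, $}, are disjoint, and so are those for the two productions of B, {+} and {*}, so the grammar passes this test. The lookahead sets of different non-terminals do overlap (n can start both E and A, for example), but that does not matter, because the parser only ever chooses among the productions of one non-terminal at a time.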
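
To make the check concrete, here is a small sketch that recomputes LOOKAHEAD for each production of the example grammar and lists any pair of productions of the same non-terminal whose sets intersect, which is the comparison usually made when testing for LL(1). The grammar encoding and the names `lookahead` and `overlaps_within` are assumptions for illustration; `NULLABLE`, `FIRST` and `FOLLOW` are copied from the worked example rather than computed.

```python
from itertools import combinations

# Hypothetical encoding of the example grammar; [] stands for ε.
GRAMMAR = {
    "E": [["n", "A"]],
    "A": [["E", "B"], []],
    "B": [["+", "A"], ["*", "A"]],
}
# Values copied from the worked example.
NULLABLE = {"A"}
FIRST = {"E": {"n"}, "A": {"n"}, "B": {"+", "*"}}
FOLLOW = {nt: {"+", "*", "$"} for nt in GRAMMAR}

def lookahead(head, body):
    """LOOKAHEAD(head -> body) = FIRST(body), plus FOLLOW(head) if body is nullable."""
    result = set()
    for sym in body:
        if sym not in GRAMMAR:   # terminal: the body cannot be nullable
            result.add(sym)
            return result
        result |= FIRST[sym]
        if sym not in NULLABLE:
            return result
    return result | FOLLOW[head]  # every symbol was nullable (or body is ε)

def overlaps_within(grammar):
    """Pairs of productions of one non-terminal whose lookahead sets intersect."""
    found = []
    for head, bodies in grammar.items():
        for b1, b2 in combinations(bodies, 2):
            common = lookahead(head, b1) & lookahead(head, b2)
            if common:
                found.append((head, b1, b2, common))
    return found

for head, bodies in GRAMMAR.items():
    for body in bodies:
        print(head, "->", " ".join(body) or "ε", sorted(lookahead(head, body)))
print("overlapping pairs:", overlaps_within(GRAMMAR))
```

An empty list from `overlaps_within` means that one symbol of lookahead is enough to choose among each non-terminal's productions.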