Question

This is not homework; it comes from a book. I was given the following grammar:

%{
#include <stdio.h>
#include <stdlib.h>  /* exit */
#include <ctype.h>

int yylex();
int yyerror();
%}

%token NUMBER

%%

command : exp '\n' { printf("%d\n", $1); exit(0); }
        | error '\n'
          { 
            yyerrok;
            printf("reenter expression: "); 
          }
          command
        ;

exp : exp '+' term { $$ = $1 + $3; }
    | exp '-' term { $$ = $1 - $3; }
    | term { $$ = $1; }
    ;

term : term '*' factor { $$ = $1 * $3; }
     | factor { $$ = $1; }
     ;

factor : NUMBER { $$ = $1; }
       | '(' exp ')' { $$ = $2; }
       ;

%%

int main() {
  return yyparse();
}

int yylex() {
  int c;

  /* eliminate blanks*/
  while((c = getchar()) == ' ');

  if (isdigit(c)) {
    ungetc(c, stdin);
    scanf("%d\n", &yylval);
    return (NUMBER);
  }

  /* makes the parse stop */
  if (c == '\n') return 0;

  return (c);
}

int yyerror(char * s) {
  fprintf(stderr, "%s\n", s);
  return 0;
} /* allows for printing of an error message */

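This is the task:

The simple error recovery technique suggested for the calculator program is flawed in that it could cause stack overflow after many errors. Rewrite it to remove this problem.

I can't figure out how a stack overflow could occur. Given that the start production is the only one that contains an error token, wouldn't yacc/bison pop all the elements off the stack before restarting?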

Answer 1

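When in doubt, the simplest thing is to use bison.

I modified the program slightly in order to avoid the bugs. First, since the new program relies on seeing '\n' tokens, I removed the line if (c == '\n') return 0; which suppressed sending '\n'. Second, I changed scanf("%d\n", &yylval); to scanf("%d", &yylval);. There is no reason to swallow the whitespace after the number, particularly if that whitespace is a newline. (However, scanf patterns don't distinguish between different kinds of whitespace, so the pattern "%d\n" has exactly the same semantics as "%d ". Neither of those would be correct.)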
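The two lexer changes described above might look roughly like the sketch below. The enum value for NUMBER and the standalone yylval definition are assumptions made only so that the fragment compiles by itself; in the real grammar file the bison-generated parser provides both. The explicit EOF check is also an addition that the answer does not mention.

```c
#include <ctype.h>
#include <stdio.h>

/* Stand-ins for what the generated parser normally provides.
   The value 258 is an assumption for this standalone sketch. */
enum { NUMBER = 258 };
int yylval;

int yylex(void) {
  int c;

  /* skip blanks, but not newlines: '\n' is now a token */
  while ((c = getchar()) == ' ')
    ;

  if (isdigit(c)) {
    ungetc(c, stdin);
    /* no trailing "\n" in the format, so the newline stays in the input */
    if (scanf("%d", &yylval) != 1) return 0;
    return NUMBER;
  }

  /* addition not in the original: report end of input as token 0 */
  if (c == EOF) return 0;

  return c; /* every other character, including '\n', is its own token */
}
```

When pasting this into the grammar file, drop the enum and the yylval definition so they do not clash with the generated ones.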

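Then I added the line yydebug = 1; at the top of main and supplied the -t ("trace") option to bison when I built the calculator. That causes the parser to show its progress in detail as it processes the input.

It helps to get a state table dump to see what is going on. You can do that with bison's -v option. I'll leave that to readers, though.

Then I ran the program and deliberately typed a syntax error: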

./error
Starting parse
Entering state 0
Reading a token: 2++3

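The trace facility has already output two lines, but once I give it some input, the trace comes pouring out.

First, the parser absorbs the NUMBER 2 and the operator +. (Note: in what follows, nterm is bison's way of saying "non-terminal" and token is "terminal"; the stack shows only state numbers.)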

Next token is token NUMBER ()
Shifting token NUMBER ()
Entering state 2
Reducing stack by rule 9 (line 25):
   $1 = token NUMBER ()
-> $$ = nterm factor ()
Stack now 0
Entering state 7
Reducing stack by rule 8 (line 22):
   $1 = nterm factor ()
-> $$ = nterm term ()
Stack now 0
Entering state 6
Reading a token: Next token is token '+' ()
Reducing stack by rule 6 (line 18):
   $1 = nterm term ()
-> $$ = nterm exp ()
Stack now 0
Entering state 5
Next token is token '+' ()
Shifting token '+' ()
Entering state 12

So far, so good. State 12 is where we end up after we have seen +; here is its definition:

State 12

    4 exp: exp '+' . term
    7 term: . term '*' factor
    8     | . factor
    9 factor: . NUMBER
   10       | . '(' exp ')'

    NUMBER  shift, and go to state 2
    '('     shift, and go to state 3

    term    go to state 17
    factor  go to state 7

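(By default, bison doesn't clutter up the state table with non-core items. I added -r itemset to get the full item set, but it would have been easy enough to do the closure by hand.)

Since in this state we are looking for the right-hand operand of +, only things that can start an expression are valid: NUMBER and (. But that is not what we have: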

Reading a token: Next token is token '+' ()
syntax error

OK, we are in State 12, and if you look at the state description above, you will see that error is not in the lookahead set either. So:

Error: popping token '+' ()
Stack now 0 5

That puts us back in State 5, which is where an operator was expected:

State 5

    1 command: exp . '\n'
    4 exp: exp . '+' term
    5    | exp . '-' term

    '\n'  shift, and go to state 11
    '+'   shift, and go to state 12
    '-'   shift, and go to state 13

So that state doesn't have a transition on error either. Onward.

Error: popping nterm exp ()
Stack now 0

OK, back at the beginning. State 0 does have an error transition:

   error   shift, and go to state 1

Now we can shift the error token and enter State 1, as indicated by the transition table:

Shifting token error ()
Entering state 1

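Now we need to resynchronize the input by skipping input tokens until we reach a newline token. (Note that bison actually pops and pushes the error token while it does this. Try not to let that distract you.)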

Next token is token '+' ()
Error: discarding token '+' ()
Error: popping token error ()
Stack now 0
Shifting token error ()
Entering state 1
Reading a token: Next token is token NUMBER ()
Error: discarding token NUMBER ()
Error: popping token error ()
Stack now 0
Shifting token error ()
Entering state 1
Reading a token: Next token is token '\n' ()
Shifting token '\n' ()
Entering state 8

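Right, we found the newline. State 8 is command: error '\n' . $@1 command. $@1 is the name of a marker (an empty production) that bison inserts in place of a mid-rule action (MRA). State 8 will reduce this marker, causing the MRA to run, which asks me for more input. Note that error recovery is complete at this point. We are now in a perfectly normal state, and the stack reflects the fact that we have, in order, a start (State 0), an error token (State 1) and a newline token (State 8):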

Reducing stack by rule 2 (line 13):
-> $$ = nterm $@1 ()
Stack now 0 1 8
Entering state 15
Reading a token: Try again:

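After the MRA is reduced, the corresponding goto is taken from State 8 and we proceed to State 15 (I left out the non-core items to avoid clutter):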

State 15

    3 command: error '\n' $@1 . command

    error   shift, and go to state 1
    NUMBER  shift, and go to state 2
    '('     shift, and go to state 3

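So now we are ready to parse a brand new command, as expected. But we have not yet reduced the error production; it is still on the stack, because it cannot be reduced until the command following the dot has been reduced. And we haven't even started on that yet.

But it is important to note that State 15 does have a transition on error, as you can see from the state's transition list. It has that transition because the closure includes the two productions for command: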

    1 command: . exp '\n'
    3        | . error '\n' $@1 command

as well as the productions for exp, term and factor, which are also part of the closure.

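So what happens if we now enter another error? The stack will be popped back to this point (0 1 8 15), a new error token will be pushed onto the stack (0 1 8 15 1), tokens will be discarded until a newline can be shifted (0 1 8 15 1 8), and a new MRA ($@1, as bison calls it) will be reduced onto the stack (0 1 8 15 1 8 15), at which point we are ready to start parsing yet another attempt.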
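The growth pattern can be made concrete with a toy model. The sketch below is not bison's algorithm; it only replays the three pushes (states 1, 8 and 15) that each bad line leaves behind in the trace above. The function name and the caller-supplied capacity are illustrative assumptions.

```c
/* Toy model of the parser's state stack after `errors` bad input lines.
   Fills `stack` with state numbers and returns the depth,
   or -1 if the states would not fit in `capacity` slots. */
int stack_after_errors(int errors, int *stack, int capacity) {
  int depth = 0;
  if (capacity < 1) return -1;
  stack[depth++] = 0;                    /* start state */
  for (int i = 0; i < errors; i++) {
    if (depth + 3 > capacity) return -1; /* the "stack overflow" case */
    stack[depth++] = 1;                  /* error token shifted */
    stack[depth++] = 8;                  /* '\n' shifted */
    stack[depth++] = 15;                 /* $@1 reduced, command still pending */
  }
  return depth;
}
```

In this model the depth is 1 + 3n after n errors, so any fixed limit is eventually exceeded. A real bison-generated parser also bounds its stack (the YYMAXDEPTH macro) and reports memory exhaustion once that limit is reached.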

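Hopefully you can see where this is going.

Note that this is really no different from the effect of any other right-recursive production. Had the grammar attempted to accept a number of expressions: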

prog: exp '\n'
    | exp '\n' { printf("%d\n", $1); } prog

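you would see the same stack build-up, which is why right recursion is discouraged. (And also because you end up inserting MRAs to avoid seeing the results in reverse order, since the stack is only reduced down to prog at the end of all input.)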
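For the exercise itself, one possible rewrite is to make the recursion left-recursive, so that each failed attempt is reduced away before the next one starts. The fragment below is only a sketch under the assumption that yylex returns '\n' as a token (as in the modified lexer); the nonterminal name line is introduced here and does not appear in the book's grammar. The rules for exp, term and factor stay as they are.

```yacc
command : /* empty */
        | command line
        ;

line : exp '\n'   { printf("%d\n", $1); exit(0); }
     | error '\n' { yyerrok; printf("reenter expression: "); }
     ;
```

With this shape, error '\n' is reduced to line and then command line is reduced to command as soon as the newline has been shifted, so the stack returns to the same depth after every bad line instead of growing by three states each time.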

The goto entries for State 15, omitted from the listing above:

    command  go to state 20
    exp      go to state 5
    term     go to state 6
    factor   go to state 7

Unclear how a yacc/bison production spec could cause a stack overflow