Question

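I am writing a parser with flex and bison for a university assignment. At the moment my goal is to read expressions made up of integers, strings, and their operators. Integers work fine; the problem is with strings. When I run the program and type a string at the console, it should print the result of the expression, which in this case is the type String followed by the string's value. So if I type "hello", I should get back it:String="hello". The problem is that at the final reduction in the bison file (where bison reduces to the start symbol using one of the start symbol's rules), the string value somehow gains a newline at its end. The string ends up as "hello\n", so it:String="hello"\n is printed to the console. I have confirmed with a parse trace that the string value is correct right up until the final reduction, at which point it gains the newline, and I can't figure out why. I think some code snippets will make the problem clearer.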

Here is the important part of the lex file. The last rule is where I return the STRING token.

%{
#include <iostream>
#include <string>
#include <stdlib.h>
#include "y.tab.h"
using namespace std;
void yyerror(char*);
%}

%%

0                       { yylval.iVal = atoi(yytext);
                          return INTEGER;
                        }

[1-9][0-9]*             { yylval.iVal = atoi(yytext);
                          return INTEGER;
                        }

[-+()~$^*/;\n]          return *yytext;
"=="                    return EQ;
"!="                    return NE;
"&&"                    return AND;
"||"                    return OR;
"\""[^"\""]*"\""        { yylval.strVal = yytext;
                          return STRING; }

Here is the yacc file. The console response is printed when the rule "program: program strExpr '\n'" is applied.

%token EQ NE AND OR STRFIND
%token<iVal> INTEGER
%token<strVal> STRING
%left OR
%left AND
%left EQ NE
%left '+' '-'
%left '*' '/'
%left UNARY
%right '^'

%{
    #include <iostream>
    #include <cmath>
    #include <string>
    #define YYDEBUG 1
    using namespace std;
    void yyerror(char *);
    int yylex(void);
%}

%union {
    int iVal;
    char* strVal;
}

%type<iVal> intExpr
%type<strVal> strExpr

%printer {fprintf(yyoutput, "%s", $$);} strExpr

%%

program:
    program intExpr '\n'         {cout<<"it:Int="<<$2<<"\n";}
    | program strExpr '\n'       {cout<<"it:String="<<$2<<"\n";}
    | program intExpr ';'
    | program strExpr ';'
    | program intExpr ';' '\n'
    | program strExpr ';' '\n'
    | program '\n'
    | program ';'
    | program ';' '\n'
    | ;
expr:
    intExpr
    | strExpr

intExpr:
    INTEGER
    | '-' intExpr %prec UNARY          { $$ = $2 * (-1); }
    | '+' intExpr %prec UNARY          { $$ = $2; }
    | intExpr '+' intExpr              { $$ = $1 + $3; }
    | intExpr '*' intExpr              { $$ = $1 * $3; }
    | intExpr '-' intExpr              { $$ = $1 - $3; }
    | intExpr '/' intExpr              { if ($3 == 0) {
                                           yyerror(0);
                                           return 1;
                                       } else
                                           $$ = $1 / $3; }
    | '(' intExpr ')'                  { $$ = $2; }
    | intExpr '^' intExpr              { int i;
                                         int val = 1;
                                         for (i = 0; i < $3; i++) {
                                             val = val * $1;
                                         }
                                         $$ = val;
                                       }
    | intExpr EQ intExpr               { if ($1 == $3)
                                             $$ = 1;
                                         else
                                             $$ = 0;
                                       }
    | intExpr NE intExpr               { if ($1 != $3)
                                             $$ = 1;
                                         else
                                             $$ = 0;
                                       }
    | intExpr AND intExpr              { if ($1 != 0 && $3 != 0)
                                             $$ = 1;
                                         else
                                             $$ = 0;
                                       }
    | intExpr OR intExpr               { if ($1 != 0 || $3 != 0)
                                             $$ = 1;
                                         else
                                             $$ = 0;
                                       }
    | ;

strExpr:
    STRING                             
    | '(' strExpr ')'                  { $$ = $2; }
    | ;

%%

void yyerror(char *s) {
    fprintf(stderr, "error\n");
}

int main(void) {
    yydebug = 1;
    yyparse();
    return 0;
}

Here is the output of a sample run:

"hello"
it:String="hello"

1+1
it:Int=2
3+4
it:Int=7

What is the extra newline after it:String="hello"?

Here is the parse trace. It shows the newline being added just before the final reduction, but I am at a loss as to why.

Starting parse
Entering state 0
Reducing stack by rule 10 (line 45):
-> $$ = nterm program ()
Stack now 0
Entering state 1
Reading a token: "hello"
Next token is token STRING ()
Shifting token STRING ()
Entering state 4
Reducing stack by rule 25 (line 93):
   $1 = token STRING ()
-> $$ = nterm strExpr ("hello")
Stack now 0 1
Entering state 11
Reading a token: Next token is token '\n' ()
Shifting token '\n' ()
Entering state 29
Reducing stack by rule 2 (line 37):
   $1 = nterm program ()
   $2 = nterm strExpr ("hello"
)
   $3 = token '\n' ()
it:String="hello"

-> $$ = nterm program ()
Stack now 0
Entering state 1
Reading a token:

I would appreciate your help.

Answer 1

yylval.strVal = yytext;

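yytext is a pointer into a static buffer, and the contents of that buffer change every time a token is read. In the trace above, the parser reads the following '\n' token before the final reduction, which is when the saved pointer starts to see the newline as well.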
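A toy model may help show why the saved pointer changes. The sketch below is not flex itself: `ToyScanner`, `keep_pointer`, and `keep_copy` are hypothetical names, and the in-place terminator handling only imitates, in simplified form, what a flex scanner is generally understood to do with its buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical toy scanner (an assumption for illustration, not flex):
// it hands out pointers into one shared buffer and NUL-terminates the
// current token in place, restoring the overwritten character later.
struct ToyScanner {
    char buffer[32];
    std::size_t pos = 0;    // index where the next token starts
    char held = '\0';       // character replaced by the terminator
    bool holding = false;   // true once a terminator has been written

    explicit ToyScanner(const char* input) {
        assert(std::strlen(input) < sizeof buffer);
        std::strncpy(buffer, input, sizeof buffer - 1);
        buffer[sizeof buffer - 1] = '\0';
    }

    // Returns the next `len` characters as a C string inside the buffer.
    char* next(std::size_t len) {
        if (holding) buffer[pos] = held;  // undo the previous terminator
        char* text = buffer + pos;
        pos += len;
        held = buffer[pos];
        buffer[pos] = '\0';               // terminate the current token
        holding = true;
        return text;
    }
};

// Saves only the pointer, like yylval.strVal = yytext.
std::string keep_pointer() {
    ToyScanner scanner("\"hello\"\n");
    char* saved = scanner.next(7);  // the 7-character token "hello" with quotes
    scanner.next(1);                // the '\n' token is read next
    return saved;                   // the buffer changed underneath the pointer
}

// Copies the text at once, before the next token is read.
std::string keep_copy() {
    ToyScanner scanner("\"hello\"\n");
    std::string saved = scanner.next(7);
    scanner.next(1);
    return saved;
}
```

In this model `keep_pointer()` yields the quoted text followed by a newline, while `keep_copy()` yields only the quoted text.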

yylval.strVal = strdup(yytext);

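Copying the text with strdup gets rid of the newline, but of course it introduces a memory leak. You need to take care of that.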
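One way to take care of the leak is to free the copy once the grammar action has finished with it. The sketch below models that ownership handoff in plain C++; `make_token_value` and `report_string` are hypothetical stand-ins for the lexer rule and the grammar action, not part of flex or bison, and `strdup` is assumed to be available (it is a POSIX function).

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <string.h>  // strdup (POSIX)

// Hypothetical stand-in for the lexer rule: returns a heap copy of the
// token text, so later scanning cannot change the saved value.
char* make_token_value(const char* token_text) {
    char* copy = strdup(token_text);
    assert(copy != nullptr);  // strdup returns a null pointer on allocation failure
    return copy;
}

// Hypothetical stand-in for the grammar action: builds the reply line,
// then releases the copy because nothing else refers to it afterwards.
std::string report_string(char* value) {
    std::string line = std::string("it:String=") + value;
    std::free(value);  // memory from strdup is released with free
    return line;
}
```

In the grammar itself, the equivalent would presumably be a call to free($2) after the cout line in the "program strExpr '\n'" action. Values that are discarded without reaching an action, for example during error recovery, would need separate handling.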
