Question

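I have the following TT.jj. If I uncomment the SomethingElse part below, it successfully parses input of the form create create blahblah or create blahblah. But if I comment out the SomethingElse part and keep the LOOKAHEAD, javacc complains that the lookahead is not necessary and is "ignored", yet the generated parser accepts only the empty string.

I thought that since javacc says it is "ignored", it should have no effect? Basically, a superfluous LOOKAHEAD causes a bug. How does this work? Maybe javacc's implementation of LOOKAHEAD does not entirely follow the spec?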

    options {
        IGNORE_CASE=true;
        STATIC=false;
        DEBUG_PARSER=true;
        DEBUG_LOOKAHEAD=false;
        DEBUG_TOKEN_MANAGER=false;
    //  FORCE_LA_CHECK=true;
        UNICODE_INPUT=true;
    }

    PARSER_BEGIN(TT)

    import java.util.*;

    /**
     * The parser generated by JavaCC
     */
    public class TT {

    }

    PARSER_END(TT)


    ///////////////////////////////////////////// main stuff concerned
    void Statement() :
    { }
    {
        LOOKAHEAD(2)
        CreateTable()
    //|
    //SomethingElse()
    }

    void CreateTable():
    {
    }
    {
        <K_CREATE> <K_CREATE> <S_IDENTIFIER>
    }

    //void SomethingElse():
    //{}{
    //      <K_CREATE> <S_IDENTIFIER>
    //}
    //
    //////////////////////////////////////////////////////////


    SKIP:
    {
        " "
    |   "\t"
    |   "\r"
    |   "\n"
    }

    TOKEN: /* SQL Keywords. prefixed with K_ to avoid name clashes */
    {
        <K_CREATE: "CREATE">
    }


    TOKEN : /* Numeric Constants */
    {
        < S_DOUBLE: ((<S_LONG>)? "." <S_LONG> ( ["e","E"] (["+", "-"])? <S_LONG>)?
                    |
                    <S_LONG> "." (["e","E"] (["+", "-"])? <S_LONG>)?
                    |
                    <S_LONG> ["e","E"] (["+", "-"])? <S_LONG>
                    )>
    |   < S_LONG: ( <DIGIT> )+ >
    |   < #DIGIT: ["0" - "9"] >
    }


    TOKEN:
    {
        < S_IDENTIFIER: ( <LETTER> | <ADDITIONAL_LETTERS> )+ ( <DIGIT> | <LETTER> | <ADDITIONAL_LETTERS> | <SPECIAL_CHARS>)* >
    |   < #LETTER: ["a"-"z", "A"-"Z", "_", "$"] >
    |   < #SPECIAL_CHARS: "$" | "_" | "#" | "@">
    |   < S_CHAR_LITERAL: "'" (~["'"])* "'" ("'" (~["'"])* "'")*>
    |   < S_QUOTED_IDENTIFIER: "\"" (~["\n","\r","\""])+ "\"" | ("`" (~["\n","\r","`"])+ "`") | ( "[" ~["0"-"9","]"] (~["\n","\r","]"])* "]" ) >

    /*
    To deal with database names (columns, tables) using not only latin base characters, one
    can expand the following rule to accept additional letters. Here is the addition of german umlauts.

    There seems to be no way to recognize letters by an external function to allow
    a configurable addition. One must rebuild JSqlParser with this new "Letterset".
    */
    |   < #ADDITIONAL_LETTERS: ["ä","ö","ü","Ä","Ö","Ü","ß"] >
    }

Answer 1

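Lookahead specifications that JavaCC says are ignored are not actually ignored. Moral: don't put lookahead specifications at non-choice points.

In more detail: when a lookahead (other than a purely semantic lookahead) appears at a non-choice point, JavaCC appears to generate a lookahead method that always returns false. The lookahead therefore fails and, there being no other alternative, an exception is thrown.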
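
The failure mode described above can be mimicked in plain Java without JavaCC. This is only a minimal sketch, not the generated parser: the class `NonChoiceLookaheadDemo`, the stub `lookaheadAlwaysFalse`, the local `ParseException`, and the pre-split token list are all hypothetical stand-ins, and the premise that the lookahead routine always returns false is taken from the observation in this answer rather than from the JavaCC source.

```java
import java.util.List;

class NonChoiceLookaheadDemo {
    /** Stand-in for JavaCC's ParseException (hypothetical, for illustration only). */
    static class ParseException extends Exception {
        ParseException(String message) { super(message); }
    }

    private final List<String> tokens;
    private int pos = 0;

    NonChoiceLookaheadDemo(List<String> tokens) {
        this.tokens = tokens;
    }

    /** Assumption: models a lookahead routine that fails regardless of the input. */
    private boolean lookaheadAlwaysFalse() {
        return false;
    }

    /** Same shape as a guarded non-choice point: one guarded path, no alternative. */
    void statement() throws ParseException {
        if (lookaheadAlwaysFalse()) {
            // Nothing to do here; control would fall through to createTable().
        } else {
            throw new ParseException("no alternative left at position " + pos);
        }
        createTable();
    }

    /** CREATE CREATE followed by any one further token. */
    void createTable() throws ParseException {
        expectCreate();
        expectCreate();
        if (pos >= tokens.size()) {
            throw new ParseException("expected identifier at position " + pos);
        }
        pos++;
    }

    private void expectCreate() throws ParseException {
        if (pos >= tokens.size() || !tokens.get(pos).equalsIgnoreCase("CREATE")) {
            throw new ParseException("expected CREATE at position " + pos);
        }
        pos++;
    }

    /** Returns true if statement() completes without a ParseException. */
    static boolean parses(List<String> tokens) {
        try {
            new NonChoiceLookaheadDemo(tokens).statement();
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Input that matches CreateTable is still rejected, because the guard never passes.
        System.out.println(parses(List.of("CREATE", "CREATE", "blahblah"))); // prints "false"
    }
}
```

Under that assumption, `createTable()` is never reached, so even well-formed input is rejected. Removing the guard altogether, which is what the moral above recommends for the grammar, would let the same input through.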

Answer 2

This is the generated code from the bad .jj:

    final public void Statement() throws ParseException {
      trace_call("Statement");
      try {
        if (jj_2_1(5)) {

        } else {
          jj_consume_token(-1);
          throw new ParseException();
        }
        CreateTable();
      } finally {
        trace_return("Statement");
      }
    }

And this is the good one:

    final public void Statement() throws ParseException {
      trace_call("Statement");
      try {
        if (jj_2_1(3)) {
          CreateTable();
        } else {
          switch ((jj_ntk==-1)?jj_ntk():jj_ntk) {
          case K_CREATE:
            SomethingElse();
            break;
          default:
            jj_la1[0] = jj_gen;
            jj_consume_token(-1);
            throw new ParseException();
          }
        }
      } finally {
        trace_return("Statement");
      }
    }

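That is, the superfluous LOOKAHEAD is not ignored at all. javacc mechanically tries to list all the alternatives in an if-else structure (in the bad case there are none), which results in a grammar that looks directly for EOF.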
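
For contrast, the if-else shape of a genuine choice point can be written by hand. The following is a rough sketch under stated assumptions, not JavaCC output: `ChoicePointLookaheadDemo` and its helper methods are hypothetical names, the input is assumed to be already split into tokens, and an "identifier" is simplified to any token other than CREATE.

```java
import java.util.List;

class ChoicePointLookaheadDemo {
    private final List<String> tokens;
    private int pos = 0;

    ChoicePointLookaheadDemo(List<String> tokens) {
        this.tokens = tokens;
    }

    /** True if the token at the given index is the CREATE keyword (case-insensitive). */
    private boolean isCreate(int index) {
        return index < tokens.size() && tokens.get(index).equalsIgnoreCase("CREATE");
    }

    private void consumeCreate() {
        if (!isCreate(pos)) {
            throw new IllegalStateException("expected CREATE at position " + pos);
        }
        pos++;
    }

    /** Simplification: an identifier is any token that is not CREATE. */
    private void consumeIdentifier() {
        if (pos >= tokens.size() || isCreate(pos)) {
            throw new IllegalStateException("expected identifier at position " + pos);
        }
        pos++;
    }

    /** Chooses between the two alternatives and returns the name of the one taken. */
    String statement() {
        if (isCreate(pos) && isCreate(pos + 1)) {
            // Two-token lookahead succeeded: CREATE CREATE identifier.
            consumeCreate();
            consumeCreate();
            consumeIdentifier();
            return "CreateTable";
        } else if (isCreate(pos)) {
            // Remaining alternative, selected on the next token alone: CREATE identifier.
            consumeCreate();
            consumeIdentifier();
            return "SomethingElse";
        }
        // No alternative applies; this corresponds to the error branch.
        throw new IllegalStateException("no alternative matches at position " + pos);
    }

    public static void main(String[] args) {
        System.out.println(new ChoicePointLookaheadDemo(List.of("create", "create", "blahblah")).statement()); // CreateTable
        System.out.println(new ChoicePointLookaheadDemo(List.of("create", "blahblah")).statement());           // SomethingElse
    }
}
```

Here the lookahead has a purpose because there is a second branch to fall back on. With only one alternative, the else side of that same structure has nothing left in it except the error path.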

Superfluous LOOKAHEAD in javacc causes a bug?
