How to parse selected expressions up to an end keyword in pyparsing

Time: 2012-04-03 04:04:41

Tags: pyparsing

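I am parsing Verilog files to extract their dependencies. I don't need to parse most of a module's contents; I am only interested in include statements and module instantiations. As a first step, I am trying to extract only the include statements. This is my code so far: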

include_pragma = Group(Keyword("`include") + quotedString + lineEnd.suppress())
module_definition = ZeroOrMore(~Keyword("endmodule") + MatchFirst([include_pragma, restOfLine])) + Keyword("endmodule")

My idea is to match either an include pragma or anything else until I reach "endmodule". When I try this grammar on the string below, pyparsing goes into some kind of infinite loop.

`include "InternalInclude.v"

localparam COMMA_WIDTH = 10;

localparam  UNKNOWN = 1'b0,
            KNOWN   = 1'b1;

reg     [DATA_WIDTH-1:0] TtiuPosQ2;
reg                      TtiuDetQ2;
reg     [           7:0] TtiuDetQ2Save;

assign DebugQO = {  DataQI,         // 65:34
                    TxTrainingEnQI, // 33
                    TtiuDetQ2,      // 32
                    TtiuPosQ2       // 31:0
                  };

always @(posedge Clk or posedge ResetI) begin
    if (ResetI) begin
        CommaPosQO <= {(DATA_WIDTH){1'b0}};
        CommaDetQO <= 0;
        CommaPosQ1 <= {(DATA_WIDTH){1'b0}};
        CommaPosQ2 <= {(DATA_WIDTH){1'b0}};
        CommaDetQ2 <= 0;
        StateQ <= 0;

        CommaPosSaveQ <= {(DATA_WIDTH){1'b0}};
        TtiuPosQ1 <= 0;
        TtiuPosQ2 <= 0;
        TtiuDetQ2 <= 0;
        TtiuDetQ2Save <= 8'h00;
    end
    else begin 
        CommaPosQO <= CommaPosC2;
        CommaDetQO <= CommaDetC2;
        CommaPosQ1 <= CommaPosC;
        CommaPosQ2 <= CommaPosQ1;
        CommaDetQ2 <= (| CommaPosQ1);
        StateQ <= StateC;
        CommaPosSaveQ <= CommaPosSaveC;
        TtiuPosQ1 <= TtiuPosC1;
        TtiuPosQ2 <= TtiuPosC2;
        TtiuDetQ2 <= TtiuDetC2;
        TtiuDetQ2Save <= TtiuDetQ2 ? 8'hFF : {1'b0, TtiuDetQ2Save[7:1]};
    end
end

endmodule

I may have misunderstood the ~ operator. Any suggestions?

Update

After using the suggested setDebug() method, I found that the infinite loop prints this:

Match {Group:({"`include" quotedString using single or double quotes Suppress:(lineEnd)}) | Re:('.*')} at loc 0(1,1)
Matched {Group:({"`include" quotedString using single or double quotes Suppress:(lineEnd)}) | Re:('.*')} -> [['`include', '"InternalInclude.v"']]
Match {Group:({"`include" quotedString using single or double quotes Suppress:(lineEnd)}) | Re:('.*')} at loc 31(4,1)
Matched {Group:({"`include" quotedString using single or double quotes Suppress:(lineEnd)}) | Re:('.*')} -> ['']
Match {Group:({"`include" quotedString using single or double quotes Suppress:(lineEnd)}) | Re:('.*')} at loc 31(4,1)
Matched {Group:({"`include" quotedString using single or double quotes Suppress:(lineEnd)}) | Re:('.*')} -> ['']
Match {Group:({"`include" quotedString using single or double quotes Suppress:(lineEnd)}) | Re:('.*')} at loc 31(4,1)
Matched {Group:({"`include" quotedString using single or double quotes Suppress:(lineEnd)}) | Re:('.*')} -> ['']

What am I doing that prevents the parse position from moving forward?

1 answer:

Answer 0 (score: 2)

The problem is the use of the restOfLine expression. It can match anything, including the empty string ''. The parser therefore proceeds as follows:

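  1. It matches `include "InternalInclude.v".
  2. It matches '' with restOfLine.
  3. It matches the same '' with restOfLine again, because restOfLine did not consume the \n, and there is still a '' in front of the \n, as there always will be ;)

To fix this, I changed restOfLine to (restOfLine + lineEnd). This forces the parser to consume the \n after matching the line. The new parser reads: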

    from pyparsing import Group, Keyword, MatchFirst, ZeroOrMore, lineEnd, quotedString, restOfLine

    include_pragma = Group(Keyword("`include") + quotedString + lineEnd.suppress())
    module_definition = ZeroOrMore(~Keyword("endmodule") + MatchFirst([include_pragma, (restOfLine + lineEnd)])) + Keyword("endmodule")
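
As a rough end-to-end check, the corrected grammar can be run against a shortened input. The sketch below is not part of the original answer: the `sample` text is a hypothetical, cut-down stand-in for the Verilog in the question, and the names `any_line`, `blank_line_match`, and `includes` are introduced only for illustration. It uses the same camelCase pyparsing names as the code above; newer pyparsing releases also offer snake_case equivalents such as `parse_string`.

```python
from pyparsing import (
    Group, Keyword, MatchFirst, ParseResults, ZeroOrMore,
    lineEnd, quotedString, restOfLine,
)

# Hypothetical, shortened stand-in for the Verilog text in the question.
sample = '''`include "InternalInclude.v"

localparam COMMA_WIDTH = 10;

always @(posedge Clk) begin
    StateQ <= StateC;
end

endmodule
'''

# On its own, restOfLine matches the empty string when it sits at a newline,
# and it does not consume that newline.
blank_line_match = restOfLine.parseString("\nnext line").asList()

include_pragma = Group(Keyword("`include") + quotedString + lineEnd.suppress())
# Pairing restOfLine with lineEnd makes every pass through the loop consume
# at least the newline, so the parse position always advances.
any_line = restOfLine + lineEnd
module_definition = (
    ZeroOrMore(~Keyword("endmodule") + MatchFirst([include_pragma, any_line]))
    + Keyword("endmodule")
)

result = module_definition.parseString(sample)

# Grouped include pragmas come back as nested ParseResults; plain lines are strings.
includes = [tok[1] for tok in result if isinstance(tok, ParseResults)]
print(includes)
```

Because only `include_pragma` is wrapped in `Group`, filtering the result for nested `ParseResults` is one simple way to pull out the include file names. Wrapping `any_line` in `.suppress()` would be another option if the uninteresting lines should not appear in the output at all.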