How do I parse multiple lines with Ruby Treetop?

Time: 2018-07-30 22:56:29

Tags: ruby parsing treetop

I am new to Ruby Treetop.

I went through this tutorial and came up with the following rules.

grammar Sexp

  rule body
    commentPortString *(I am stuck here)*
  end

  rule interface
    space? (intf / intfWithSize) space? ('\n' / end_of_file) <Interface>
  end

  rule commentPortString
    space? '//' space portString space? ('\n' / end_of_file) <CommentPortString>
  end

  rule portString
    'Port' space? '.' <PortString>
  end

  rule expression
    space? '(' body ')' space? <Expression>
  end

  rule intf
    (input / output) space wire:wireName space? ';' <Intf>
  end

  rule intfWithSize
    (input / output) space? width:ifWidth space? wire:wireName space? ';' <IntfWithSize>
  end

  rule input
    'input'
  end

  rule output
    'output'
  end

  rule ifWidth
    '[' space? msb:digits space? ':' space? lsb:digits ']' <IfWidth>
  end

  rule digits
    [0-9]+
  end

  rule integer
    ('+' / '-')? [0-9]+ <IntegerLiteral>
  end

  rule float
    ('+' / '-')? [0-9]+ (('.' [0-9]+) / ('e' [0-9]+)) <FloatLiteral>
  end

  rule string
    '"' ('\"' / !'"' .)* '"' <StringLiteral>
  end

  rule signalTypeString
    '"' if_sig_name:signalType '"' <SignalTypeString>
  end

  rule signalType
    [a-zA-Z] [a-zA-Z0-9_]* (receiveLiteral / transmitLiteral) <SignalType>
  end

  rule receiveLiteral
    '.receive'
  end

  rule transmitLiteral
    '.transmit'
  end

  rule identifier
    [a-zA-Z\=\*] [a-zA-Z0-9_\=\*]* <Identifier>
  end

  rule wireName
    [a-zA-Z] [a-zA-Z0-9_]* <WireName>
  end

  rule non_space
    !space .
  end

  rule space
    [\s\t]+
  end

  rule newLine
    [\n\r]+
  end

  rule end_of_file
    !.
  end

end

I want the parser to extract the blob below. It always starts with Port. and ends with a blank line.

    // Port.
    output        send;
    input         free;
    output        fgcg;
    output[  2:0] state_id;
    output[  1:0] stream_id;
`ifdef SIMULATION
    output[ 83:0] dbg_id;
`endif

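The rules above recognize every line of the text when each line is passed in on its own, but I cannot extract the blob as a whole. I also want to extract only the matching text and ignore everything else.

Can someone point me in the right direction?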

1 Answer:

Answer 0: (score: 1)

Something like the following may be what you are looking for. Without more information it is hard to fully understand your problem.

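Your space rule contains \s, which already matches \n. Any space? in front of an explicit '\n' therefore consumes the newline first, and the '\n' that follows has nothing left to match, so the line fails to parse. If you change the space rule to [^\S\n]+, it no longer matches \n, and you can look for the newline explicitly.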
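The difference between the two character classes can be checked in plain Ruby. This is only a sketch, and it assumes that a Treetop character class behaves like the same class inside a Ruby Regexp; the constant and method names are illustrative and do not come from the grammar.

```ruby
# The two versions of the `space` rule, written as plain Ruby regexps.
# (Names are illustrative only; they are not part of the grammar.)
ORIGINAL_SPACE = /[\s\t]+/   # \s already covers space, tab, \r and \n
FIXED_SPACE    = /[^\S\n]+/  # any whitespace character except \n

# Returns the first run of characters in `text` that `pattern` matches.
def first_whitespace_run(pattern, text)
  text[pattern]
end

line = "send;  \nnext"

# The original class runs straight through the newline ...
puts first_whitespace_run(ORIGINAL_SPACE, line).inspect  # => "  \n"
# ... while the fixed class stops just before it.
puts first_whitespace_run(FIXED_SPACE, line).inspect     # => "  "
```

With the original class, nothing is left for a following "\n" terminal to match, which is why the per-line rules cannot be chained together.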

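If you want a completely blank line to end the Port. block, you should look for it explicitly: "\n" ("\n" / end_of_file), that is, the newline that ends the last line followed by either a second newline or the end of the input.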
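The question also asks to ignore everything outside the block. One option is to slice the block out before parsing. This is a plain-Ruby preprocessing step, not something Treetop does for you. The sketch below keeps the text from the `// Port.` line through the first blank line; `extract_port_block` is a hypothetical helper name.

```ruby
# Hypothetical helper: returns the text from the "// Port." comment line
# through the blank line (or end of input) that closes the block.
# Returns nil when no "// Port." line is found.
def extract_port_block(text)
  lines = text.lines

  # Header line: optional indentation, "//", then "Port" and ".".
  header = %r{\A[^\S\n]*//[^\S\n]*Port[^\S\n]*\.[^\S\n]*\n?\z}
  start = lines.index { |l| l.match?(header) }
  return nil unless start

  # The block ends at the first whitespace-only line after the header,
  # or at the last line of the input if there is no such line.
  offset = lines.drop(start + 1).index { |l| l.strip.empty? }
  stop = offset ? start + 1 + offset : lines.length - 1

  lines[start..stop].join
end

source = "module m;\n    // Port.\n    output        send;\n    input         free;\n\nwire x;\n"
puts extract_port_block(source)
```

The returned slice still has to be parsed. Lines such as the `ifdef and `endif directives in the sample are not matched by any rule in the grammar shown here, so they would need a rule of their own or would have to be filtered out first.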

Hope that helps...

grammar Sexp

  rule body
    commentPortString interface* portEnd
  end

  rule interface
    space? (intf / intfWithSize) space? "\n" <Interface>
  end

  rule commentPortString
    space? '//' space? portString space? "\n" <CommentPortString>
  end

  rule portString
    'Port' space? '.' <PortString>
  end

  # Port block ends with a blank line
  rule portEnd
    "\n" / end_of_file
  end

  rule expression
    space? '(' body ')' space? <Expression>
  end

  rule intf
    (input / output) space wire:wireName space? ';' <Intf>
  end

  rule intfWithSize
    (input / output) space? width:ifWidth space? wire:wireName space? ';' <IntfWithSize>
  end

  rule input
    'input'
  end

  rule output
    'output'
  end

  rule ifWidth
    '[' space? msb:digits space? ':' space? lsb:digits ']' <IfWidth>
  end

  rule digits
    [0-9]+
  end

  rule integer
    ('+' / '-')? [0-9]+ <IntegerLiteral>
  end

  rule float
    ('+' / '-')? [0-9]+ (('.' [0-9]+) / ('e' [0-9]+)) <FloatLiteral>
  end

  rule string
    '"' ('\"' / !'"' .)* '"' <StringLiteral>
  end

  rule signalTypeString
    '"' if_sig_name:signalType '"' <SignalTypeString>
  end

  rule signalType
    [a-zA-Z] [a-zA-Z0-9_]* (receiveLiteral / transmitLiteral) <SignalType>
  end

  rule receiveLiteral
    '.receive'
  end

  rule transmitLiteral
    '.transmit'
  end

  rule identifier
    [a-zA-Z\=\*] [a-zA-Z0-9_\=\*]* <Identifier>
  end

  rule wireName
    [a-zA-Z] [a-zA-Z0-9_]* <WireName>
  end

  rule non_space
    !space .
  end

  rule space
    [^\S\n]+
  end

  rule newLine
    [\n\r]+
  end

  rule end_of_file
    !.
  end

end