Question

I have this pair of working rules in Treetop that the perfectionist in me feels should be a single rule, or at least something more beautiful:

rule _
  crap
  /
  " "*
end

rule crap
  " "* "\\x0D\\x0A"* " "*
end

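I'm parsing some expressions that every now and then end with "\x0D\x0A". Yes, not "\r\n" but the literal text "\x0D\x0A". Something got double-escaped at some point. Long story.

These rules work, but they're ugly and it bothers me. I tried this: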

rule _
  " "* "\\x0D\\x0A"* " "*
  /
  " "*
end

which resulted in

SyntaxError: (eval):1276:in `load_from_string': compile error
(eval):1161: class/module name must be CONSTANT
    from /.../gems/treetop-1.4.9/lib/treetop/compiler/grammar_compiler.rb:42:in `load_from_string'
    from /.../gems/treetop-1.4.9/lib/treetop/compiler/grammar_compiler.rb:35:in `load'
    from /.../gems/treetop-1.4.9/lib/treetop/compiler/grammar_compiler.rb:32:in `open'
    from /.../gems/treetop-1.4.9/lib/treetop/compiler/grammar_compiler.rb:32:in `load'

Ideally, I'd like to write something like:

rule _
  (" " | "\\x0D\\x0A")*
end

but this doesn't work, and while we're at it, I also discovered that you can't have only one * per rule:

rule _
  " "*
  /
  "\n"*
end

will match " ", but never \n.

Answer 1

I see you're using three different OR characters: /, | and \ (of which only the first means OR).

This works fine:

grammar Language

  rule crap
    (" " / "\\x0D\\x0A")* {
      def value
        text_value    
      end
    }
  end

end

#!/usr/bin/env ruby

require 'rubygems'
require 'treetop'
require 'polyglot'
require 'language'

parser = LanguageParser.new
value = parser.parse(' \\x0D\\x0A   \\x0D\\x0A   ').value
print '>' + value + '<'

which prints:

> \x0D\x0A   \x0D\x0A   <
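
As a side note on the test input above, here is a small sketch of how the double-escaped text differs from a real CR LF pair in Ruby string literals. The variable names are only for illustration.

```ruby
# Single quotes: "\\" is one backslash, so this is the 8-character text \x0D\x0A.
escaped = '\\x0D\\x0A'

# Double quotes: \x0D and \x0A are real escapes, i.e. carriage return + line feed.
real = "\x0D\x0A"

puts escaped.length   # 8
puts real.length      # 2
puts real == "\r\n"   # true
puts escaped == real  # false
```

This is why the grammar has to match "\\x0D\\x0A" rather than "\r\n".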

Answer 2

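You said "I also discovered that you can't have only one * per rule" (you mean: you can have), "that will match " ", but never \n".

Of course: the first alternative succeeds when it matches zero space characters, so the second alternative is never tried. You could use a + instead: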

rule _
  " "+
  /
  "\n"*
end

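If you want to match any number of space or newline characters, you could also put the choice in parentheses and repeat the whole group: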

rule _
  (" " / "\n")*
end
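
To see why the order of alternatives matters, here is a minimal sketch of PEG-style ordered choice in plain Ruby. The helpers `lit`, `star`, `plus`, `choice` and `full_match?` are hypothetical and written only for this illustration. They are not part of Treetop's API, and Treetop's generated parser works differently internally.

```ruby
# Toy PEG matchers (hypothetical helpers, not Treetop's API). Each matcher is a
# lambda taking (input, pos) and returning the new position, or nil on failure.

def lit(str)
  ->(input, pos) { input[pos, str.length] == str ? pos + str.length : nil }
end

# Zero or more repetitions: never fails, but may consume nothing.
def star(matcher)
  lambda do |input, pos|
    while (nxt = matcher.call(input, pos)) && nxt > pos
      pos = nxt
    end
    pos
  end
end

# One or more repetitions: fails unless the first repetition matches.
def plus(matcher)
  lambda do |input, pos|
    first = matcher.call(input, pos)
    first && star(matcher).call(input, first)
  end
end

# Ordered choice: the first alternative that succeeds wins; later ones are skipped.
def choice(*matchers)
  lambda do |input, pos|
    result = nil
    matchers.each do |m|
      result = m.call(input, pos)
      break if result
    end
    result
  end
end

# True when the matcher consumes the whole input.
def full_match?(matcher, input)
  matcher.call(input, 0) == input.length
end

star_first = choice(star(lit(" ")), star(lit("\n")))  # " "* / "\n"*
plus_first = choice(plus(lit(" ")), star(lit("\n")))  # " "+ / "\n"*
combined   = star(choice(lit(" "), lit("\n")))        # (" " / "\n")*

puts full_match?(star_first, "\n\n")  # false: " "* succeeds on zero spaces
puts full_match?(plus_first, "\n\n")  # true: " "+ fails, so "\n"* is tried
puts full_match?(combined, " \n \n")  # true: each repetition picks either one
```

In the first case the choice succeeds without consuming anything, so the newlines are left unparsed and the overall match fails.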

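Your error "class/module name must be CONSTANT" occurs because the rule name is used as the prefix of a module name, and that module holds any methods attached to your rule. A module name may not begin with an underscore, so you can't use methods in a rule whose name begins with an underscore.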
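
The naming restriction can be checked in plain Ruby. This is only a sketch: the names "Crap0" and "_0" are illustrative guesses at what a generated module name might look like, not necessarily what Treetop emits, and `constant_name?` is a hypothetical helper.

```ruby
# Checks whether Ruby accepts `name` as a constant name, which is what a
# module name has to be. Uses a throwaway module so nothing global is defined.
def constant_name?(name)
  Module.new.const_set(name, Module.new)
  true
rescue NameError
  # Raised for names that do not start with an uppercase letter.
  false
end

puts constant_name?("Crap0")  # true
puts constant_name?("_0")     # false
```

Renaming the rule so that it starts with a letter avoids the problem when the rule needs attached methods.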

I believe this should be a single rule in Treetop