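PEG grammar not working as expected

Asked: 2014-03-30 22:01:57

Tags: parsing clojure peg

I'm working on a PEG grammar that takes code in a music programming language and builds a parse tree of musical events (notes, chords, volume/tempo changes, etc.). One feature of my MPL is that it supports voices, i.e. different sequences of events that happen at the same time. I'm having a hard time getting my Instaparse grammar to parse this correctly. What I want is a voices tag made up of one or more voice tags, each of which consists of a voice definition (e.g. V1:) followed by any number of events. The voices tag should end either with V0: (which marks the end of the split voices, so we are back to a single voice, or "voice zero") or at the end of the file.

Here is an excerpt from my grammar in progress (for clarity, I've left out the definitions of note, chord, etc.):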

part                    = <ows> event+
<event>                 = chord | note | rest | octave-change |
                          attribute-change | voices |
                          marker | at-marker

voices                  = voice+ 
voice                   = !voices voice-number voice-events? 
                          (<voice-zero> | #"\z")
voice-number            = <"V"> #"[1-9]\d*" <":"> <ows>
<voice-zero>            = <"V0:"> <ows>
voice-events            = !voices event+ 

...

ows                     = #"\s*"

Given the following code:

V1: o2 b1/>b o2 g+/>g+ o2 g/>g 
V0: e8 f+ g+ a b2

running the parser gives the following output:

[:part 
  [:voices 
    [:voice [:voice-number "1"] 
            [:voice-events 
              [:octave-change "2"] [:chord [:note [:pitch "b"] 
              [:duration "1"]] [:octave-change ">"] [:note [:pitch "b"]]] 
              [:octave-change "2"] [:chord [:note [:pitch "g+"]] 
              [:octave-change ">"] [:note [:pitch "g+"]]] 
              [:octave-change "2"] [:chord [:note [:pitch "g"]]
              [:octave-change ">"] [:note [:pitch "g"]]]]]] 
  [:note [:pitch "e"] [:duration "8"]] 
  [:note [:pitch "f+"]] 
  [:note [:pitch "g+"]] 
  [:note [:pitch "a"]] 
  [:note [:pitch "b"] [:duration "2"]]]

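This is exactly what I want. The V0: marks the end of the voices tag, and the last 5 notes sit on their own inside the part tag.

However, when I change the V0 to V2, I get this: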

[:part 
  [:voices 
    [:voice [:voice-number "1"] 
            [:voice-events 
              [:octave-change "2"] [:chord [:note [:pitch "b"] [:duration "1"]] 
              [:octave-change ">"] [:note [:pitch "b"]]] [:octave-change "2"] 
              [:chord [:note [:pitch "g+"]] [:octave-change ">"] 
              [:note [:pitch "g+"]]] [:octave-change "2"] 
              [:chord [:note [:pitch "g"]] [:octave-change ">"] 
              [:note [:pitch "g"]]] 
              [:voices 
                [:voice [:voice-number "2"] 
                [:voice-events 
                  [:note [:pitch "e"] [:duration "8"]] [:note [:pitch "f+"]] 
                  [:note [:pitch "g+"]] [:note [:pitch "a"]] 
                  [:note [:pitch "b"] [:duration "2"]]]]]]]]]

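For some reason, either the voice 1 tag or its voice-events tag is not terminating as expected, and the second voice is being swallowed up as part of the first voice's voice-events. I also don't want a second voices tag; voice 2 should sit inside the main voices tag.

What I want is this: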

[:part 
  [:voices 
    [:voice [:voice-number "1"] 
            [:voice-events 
              [:octave-change "2"] [:chord [:note [:pitch "b"] [:duration "1"]] 
              [:octave-change ">"] [:note [:pitch "b"]]] [:octave-change "2"] 
              [:chord [:note [:pitch "g+"]] [:octave-change ">"] 
              [:note [:pitch "g+"]]] [:octave-change "2"] 
              [:chord [:note [:pitch "g"]] [:octave-change ">"] 
              [:note [:pitch "g"]]]]]
    [:voice [:voice-number "2"] 
            [:voice-events 
              [:note [:pitch "e"] [:duration "8"]] [:note [:pitch "f+"]] 
              [:note [:pitch "g+"]] [:note [:pitch "a"]] 
              [:note [:pitch "b"] [:duration "2"]]]]]]

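I can't figure out what I'm doing wrong, but I think it has something to do with how I've defined the voice tag and/or the voice-events tag. It probably has to do with how I'm using negative lookahead, which I don't think I fully understand yet. Can anyone figure out how I can fix my grammar?

Thanks! :)

Solved!

Thanks, @Daniel! I've reworked my grammar into this, and it works exactly the way I want: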

part                    = <ows> (voices | event)+
<event>                 = chord | note | rest | octave-change |
                          attribute-change | marker | at-marker

voices                  = voice+ (<voice-zero> | <#"\z">)
voice                   = voice-number event*
voice-number            = <"V"> #"[1-9]\d*" <":"> <ows>
<voice-zero>            = <"V0:"> <ows>

...

ows                     = #"\s*"

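The biggest change is in how I define part and event. Before, I had defined these terms so that voices was an event, so any subsequent voice was consumed and lumped into the previous voice's events. By pulling voices out of the definition of event and redefining part as a variable number of either voices groupings or events, I eliminated the ambiguity and got the grammar to behave the way I want.

After that, the events within each voice were grouping correctly, but I still had a problem: every voice ended up inside its own separate voices tag, when I need them all to belong to the same voices grouping. I fixed this by specifying that a voices tag ends with either "V0:" or the end of the file (\z); in other words, by being more specific about how much code I want the voices tag to consume.

The moral of the story is that if you're writing a PEG grammar and running into problems, you probably need to make your definitions less ambiguous! I also ended up not using negative lookahead at all, which I think helped simplify and disambiguate my grammar.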
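
A minimal sketch of how the revised grammar could be exercised from Clojure, assuming Instaparse is available on the classpath. The note rule below is a hypothetical one-letter stand-in for the event definitions elided from the post, and the var names are illustrative only, so this checks the shape of the voices handling rather than the real grammar.

```clojure
(require '[instaparse.core :as insta])

;; Revised grammar from the post, with the elided event rules replaced by a
;; hypothetical stand-in: a note is one letter a-g plus optional whitespace.
(def voices-parser
  (insta/parser
    "part         = <ows> (voices | event)+
     <event>      = note
     voices       = voice+ (<voice-zero> | <#'\\z'>)
     voice        = voice-number event*
     voice-number = <'V'> #'[1-9]\\d*' <':'> <ows>
     <voice-zero> = <'V0:'> <ows>
     note         = #'[a-g]' <ows>
     ows          = #'\\s*'"))

;; Two consecutive voices: both should be siblings under a single :voices node.
(def two-voices (voices-parser "V1: c d V2: e f"))

;; V0: should close the :voices node, so the trailing note sits under :part.
(def back-to-zero (voices-parser "V1: c d V0: e"))

;; insta/failure? distinguishes a parse tree from a failure object.
(println (insta/failure? two-voices))
(println two-voices)
(println back-to-zero)
```

Because the terminator now belongs to voices rather than to each voice, a voice can simply stop where the next voice number begins, which is what lets V1 and V2 share one parent.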

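1 Answer:

Answer 0 (score: 2)

I think you're right: it is the negative lookahead that is causing the problem. Without your full grammar I can't test this properly, but this line: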

voice-events = !voices event+ 

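means: something that does not match voices, followed by one or more events. Note that the lookahead is only checked once, at the position where voice-events begins.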
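
The fact that a negative lookahead guards only a single position can be checked with a toy grammar. This is a sketch assuming Instaparse is on the classpath; the word rule and the var names are hypothetical and not part of the grammar in the question.

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical toy grammar: the lookahead is tested once, at the position
;; where word starts. It consumes nothing and does not constrain what the
;; regex matches afterwards.
(def word-parser
  (insta/parser "word = !'a' #'[a-z]+'"))

;; An "a" appears later in the input, past the guarded position.
(def accepted (word-parser "ba"))

;; An "a" sits exactly at the guarded position.
(def rejected (word-parser "ab"))

(println accepted)
(println (insta/failure? rejected))
```

In the same way, !voices in front of event+ only rules out voices as the very first thing in voice-events; any later event is free to be a voices node.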

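I assume the intention is that voice-events cannot contain voices recursively, but at the moment it can, indirectly: each voice-events can contain events, and an event can be voices.

In the example above, the first event in V1 is an octave change (which satisfies the not-voices condition). This allows the voices that follow to be consumed within that voice-events definition. If that makes sense.

To fix this, you could (perhaps) define it the other way around:

voice-event = chord | note | rest | octave-change | attribute-change | marker | at-marker
event = voice-event | voices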