Question

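I only want to extract the values of attribute1 and attribute3. I don't understand why charset doesn't seem to "skip" the other attributes in my case (attribute3 is not extracted the way I want):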

content: {<tag attribute1="valueattribute1" attribute2="valueattribute2" attribute3="valueattribute3">
</tag>
<tag attribute2="valueattribute21" attribute1="valueattribute11" >
</tag>
}


attribute1: [{attribute1="} copy valueattribute1 to {"} thru {"}]
attribute3: [{attribute3="} copy valueattribute3 to {"} thru {"}]

spacer: charset reduce [tab newline #" "]
letter: complement spacer 
to-space: [some letter | end]

attributes-rule: [(valueattribute1: none valueattribute3: none) [attribute1 | none] any letter [attribute3 | none] (print valueattribute1 print valueattribute3)
| [attribute3 | none] any letter [attribute1 | none] (print valueattribute3 print valueattribute1
valueattribute1: none valueattribute3: none
)
| none
]

rule: [any [to {<tag } thru {<tag } attributes-rule {>} to {</tag>} thru {</tag>}] to end]

parse content rule

Output

>> parse content rule
valueattribute1
none
== true
>>

Answer 1

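Short answer: [any letter] eats your attribute3="...", because the #"^"" character is a 'letter by your definition. Additionally, you may run into trouble when there is no attribute2: your generic second-attribute rule will then eat attribute3, and your attribute3 rule will have nothing left to match. It is better to be explicit and have an optional attribute2, or an optional anything-but-attribute3.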

attribute1="foo"       attribute2="bar" attribute3="foobar" 
<- attribute1="..." -> <-     any letter                 -> <- attribute3="..." ->

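Additionally, 'parse without the /all refinement ignores whitespace (or is at least very clunky where whitespace is concerned). /all is strongly recommended for this kind of parsing.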
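A minimal sketch of the point about the quote character, assuming Rebol 2 and reusing the spacer and letter definitions from the question. The sample string and the word eaten are made up for illustration.

```rebol
REBOL [Title: "complement charset sketch"]

spacer: charset reduce [tab newline #" "]
letter: complement spacer

; The double quote is not whitespace, so it is a member of letter.
print find letter #"^""

; any letter therefore runs through a whole attribute, quotes included,
; and only stops at the next space.
eaten: none
parse/all {attribute3="valueattribute3" rest} [copy eaten any letter to end]
print eaten   ; attribute3="valueattribute3"
```

A narrower charset (for example only the characters allowed in an attribute name) would avoid swallowing the quoted value by accident.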

Answer 2

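First of all, you are not using parse/all. In Rebol 2 that means whitespace is effectively stripped out before the parse runs. That is not the case in Rebol 3: if your parse rules are in block form (as they are here), /all is implied.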
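The difference can be seen with a small sketch, assuming Rebol 2 behaviour as described above. The input string is made up for illustration.

```rebol
REBOL [Title: "parse vs parse/all sketch"]

; Rebol 2, block rule, no /all: the space between the two words is skipped.
print parse "hello world" ["hello" "world"]

; With /all nothing is skipped, so the same rule no longer matches...
print parse/all "hello world" ["hello" "world"]

; ...unless the space is matched explicitly.
print parse/all "hello world" ["hello" " " "world"]
```

Under Rebol 2 the first and third calls should succeed and the second should fail.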

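(Note: there seemed to be a consensus that Rebol 3 would throw out the non-block form of parse rules, in favor of the split function for those "minimal" parse scenarios. That would get rid of /all. Unfortunately, no action has been taken on this yet.)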

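Second, your code has bugs, which I am not going to spend time sorting out. (That is mostly because I think using Rebol's parse to process XML/HTML is a rather silly idea :P)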

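But don't forget you have an important tool. If you use a set-word in a parse rule, it captures the parse position into a variable. You can then print it and see where you are. Change the part of attributes-rule where you first say any letter to pos: (print pos) any letter, and you will see:

>> parse/all content rule
 attribute2="valueattribute2" attribute3="valueattribute3">
</tag>
<tag attribute2="valueattribute21" attribute1="valueattribute11" >
</tag>

valueattribute1
none
== true

See the leading space? The rules right before any letter leave you at a space... and since you said any letter is fine, zero letters is fine too, and everything is thrown off.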
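The same technique in isolation, as a small sketch assuming Rebol 2. The sample string is made up, and probe is used instead of print so that the leading space stays visible inside the molded string.

```rebol
REBOL [Title: "set-word position sketch"]

sample: {<tag attribute1="foo" attribute2="bar">}

parse/all sample [
    thru {attribute1="} to {"} skip   ; consume attribute1 and its closing quote
    pos: (probe pos)                  ; show what is left of the input at this point
    to end
]
```

The probed value begins with the space that separates the two attributes, which is exactly where a rule like any letter matches zero characters.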

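(Note: Rebol 3 has an even better debugging tool... the word ??. When you put it in a parse block, it tells you which token/rule is currently being processed, as well as the state of the input. With this tool you can find out what is going on more easily:

>> parse "hello world" ["hello" ?? space ?? "world"]
space: " world"
"world": "world"
== true

...though it is really buggy on r3 mac intel right now.)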

Additionally, if you are not using copy, your to X thru X pattern is unnecessary; thru X alone achieves the same thing. If you do want a copy, you can write the briefer copy Y to X X, or, if X is just a single symbol, the clearer copy Y to X skip.
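A short sketch of these shorter forms, assuming Rebol 2. The sample string and the words value1 and value3 are made up for illustration.

```rebol
REBOL [Title: "thru and copy ... to ... skip sketch"]

sample: {attribute1="foo" attribute3="bar"}

value1: value3: none

parse/all sample [
    thru {attribute1="}          ; replaces: to {attribute1="} thru {attribute1="}
    copy value1 to {"} skip      ; replaces: copy value1 to {"} thru {"}
    thru {attribute3="}
    copy value3 to {"} skip
    to end
]

print [value1 value3]   ; foo bar
```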

Where you find yourself writing repetitive code, remember that Rebol lets you go a step further with compose and the like.
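The answer's own example did not survive here, so the following is only a sketch of the compose idea, assuming Rebol 2 (where to-word on a string yields a globally bound word). The helper make-attribute-rule and the sample string are hypothetical names introduced for illustration.

```rebol
REBOL [Title: "compose rule template sketch"]

; Hypothetical helper: build the rule for attributeN from one template.
make-attribute-rule: func [num [integer!]] [
    compose [
        thru (rejoin [{attribute} num {="}])
        copy (to-word rejoin [{valueattribute} num])
        to {"} skip
    ]
]

attribute1: make-attribute-rule 1
attribute3: make-attribute-rule 3

sample: {<tag attribute1="a" attribute2="b" attribute3="c">}
parse/all sample [attribute1 attribute3 to end]

print [valueattribute1 valueattribute3]   ; a c
```

Because compose only evaluates the parenthesized parts, the rest of the template stays an ordinary parse rule and the two attribute rules no longer have to be written out by hand.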

Answer 3

Adding parse/all did not seem to change anything. In the end this seems to work (using a set-word really did help a lot with debugging!!!). What do you think?

content: {<tag attribute1="valueattribute1" attribute2="valueattribute2" attribute3="valueattribute3">
</tag>
<tag attribute2="valueattribute21" attribute1="valueattribute11" >
</tag>
}


attribute1: [to {attribute1="} thru {attribute1="} copy valueattribute1 to {"} thru {"}]
attribute3: [to {attribute3="} thru {attribute3="} copy valueattribute3 to {"} thru {"}]

letter: charset reduce ["ABCDEFGHIJKLMNOPQRSTUabcdefghijklmnopqrstuvwxyz1234567890="]

attributes-rule: [(valueattribute1: none valueattribute3: none) 
[attribute1 | none] any letter pos: 
[attribute3 | none] (print valueattribute1 print valueattribute3)
| [attribute3 | none] any letter [attribute1 | none] (print valueattribute3 print valueattribute1
valueattribute1: none valueattribute3: none
)
| none
]

rule: [any [to {<tag } thru {<tag } attributes-rule {>} to {</tag>} thru {</tag>}] to end]

parse content rule

Output:

>> parse/all content rule
valueattribute1
valueattribute3
valueattribute11
none
== true
>>
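One possible alternative, offered only as a sketch and assuming Rebol 2: copy the text of each opening tag first, then look up every attribute independently, so that the order of the attributes no longer matters. The helper get-attribute and the word tag-text are hypothetical names, not part of the code above.

```rebol
REBOL [Title: "order-independent attribute lookup sketch"]

content: {<tag attribute1="valueattribute1" attribute2="valueattribute2" attribute3="valueattribute3">
</tag>
<tag attribute2="valueattribute21" attribute1="valueattribute11" >
</tag>
}

; Hypothetical helper: value of the named attribute inside a tag's text, or none.
get-attribute: func [tag-text [string!] name [string!] /local value] [
    value: none
    parse/all tag-text compose [
        thru (rejoin [name {="}]) copy value to {"} to end
    ]
    value
]

parse/all content [
    any [
        thru {<tag } copy tag-text to {>} skip
        (print [
            get-attribute tag-text "attribute1"
            get-attribute tag-text "attribute3"
        ])
    ]
    to end
]
```

With the content above this should print valueattribute1 valueattribute3 for the first tag and valueattribute11 none for the second. The lookup is a plain substring search, so an attribute whose name merely ends in the requested name would also match.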

Parse and charset: why doesn't my script work
