Question

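In honor of Rebol 3 going open source any-minute-now (?), I'm back to messing with it. As an exercise, I'm trying to write my own JSON parser in the PARSE dialect.

Since Douglas Crockford credits influence of Rebol on his discovery of JSON, I thought it would be easy. Apart from replacing braces with brackets and getting rid of all those commas, one of the barriers to simply using LOAD on the string is that when JSON wants the equivalent of a SET-WORD!, it uses something that looks like a string to Rebol's tokenizer, followed by an illegal stray colon: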

{
    "key one": {
         "summary": "This is the string content for key one's summary",
         "value": 7
    },
    "key two": {
         "summary": "Another actually string, not supposed to be a 'symbol'",
         "value": 100
    }
}

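Basically, I want to find every case like "foo bar": and turn it into foo-bar:, while leaving alone any matched pair of quotes that is not followed by a colon.

When I tackled this in PARSE (which I understand fairly well in principle but still haven't used much), a couple of questions came up. Mainly: what are the promised conditions when you escape into code and modify the series out from under the parser, specifically in Rebol 3? More generally, is it the "right tool for the job"?

Here is the rule I tried, which appears to work for this part of the task: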

any [
    ; require a matched pair of quotes & capture series positions before
    ; and after the first quote, and before the last quote

    to {"} beforePos: skip startPos: to {"} endPos: skip

    ; optional colon next (if not there the rest of the next rule is skipped)

    opt [
        {:}

        ; if we got to this part of the optional match rule, there was a colon.
        ; we escape to code changing spaces to dashes in the range we captured

        (
            setWordString: copy/part startPos endPos
            replace/all setWordString space "-"
            change startPos setWordString
        )

        ; break back out into the parse dialect, and instead of changing the 
        ; series length out from under the parser we jump it back to the position
        ; before that first quote that we saw

        :beforePos

        ; Now do the removals through a match rule.  We know they are there and
        ; this will not cause this "colon-case" match rule to fail...because we
        ; saw those two quotes on the first time through!

        remove [{"}] to {"} remove [{"}]
    ]
]

Is that okay? Is there any chance that the change startPos setWordString in the open code could break the outer parse, if not in this case then in something subtly different?
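To make the concern concrete, here is a minimal sketch outside of PARSE. It relies on the fact that a Rebol series position is a series plus an index; the names s and pos are made up for illustration. It does not settle what PARSE itself promises, it only shows why a same-length edit (like swapping spaces for dashes) leaves saved positions meaningful while a length-changing edit does not.

```rebol
s: copy "a b c"
pos: find s "c"             ; pos refers to index 5 of s

; same-length edit: pos still refers to "c"
replace/all s " " "-"
print pos                   ; prints: c

; length-changing edit before pos: index 5 now holds a different character
insert s ">>"
print pos                   ; prints: b-c
```

The rule in the question only ever makes the same-length kind of edit from inside the paren, and does its removals through PARSE's own remove keyword.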

As always, any didactic "it's cleaner/shorter/better a different way" advice is appreciated.

P.S. Why isn't there a replace/all/part?

Answer 1

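New keywords such as change, insert and remove should make this kind of thing easier. I suppose the main downside of this approach is the latency involved in pushing the series around (I've seen it mentioned that building a new string is faster than manipulating one in place).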

token: [
    and [{"} thru {"} any " " ":"]
    remove {"} copy key to {"} remove {"} remove any " "
    (key: replace/all key " " "-")
]

parse/all json [
    any [
        to {"} [
            and change token key
            ; next rule here, example:
            copy new-key thru ":" (probe new-key)
            | skip
        ]
    ]
]

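This is a little convoluted, because I can't seem to get 'change to work the way I'd expect (it behaves like change rather than change/part), but in theory you should be able to shorten it along these lines and end up with a fairly clean rule. The ideal might be: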

token: [
    {"} copy key to {"} skip any " " and ":"
    (key: replace/all key " " "-")
]

parse/all json [
    any [
        to {"} change token key
        | thru {"}
    ]
]

Edit: another fudge to work around change:

token: [
    and [{"} key: to {"} key.: skip any " " ":"]
    (key: replace/all copy/part key key. " " "-")
    remove to ":" insert key
]

parse/all json [
    any [to {"} [token | skip]]
]
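The remark at the top of this answer, that building a new string may be faster than editing in place, can be sketched as follows. This is only an illustration under stated assumptions: Rebol 2 or R3-alpha (where parse/all exists), no escaped quotes inside strings, and made-up names (the sample json value, output, text, other). It copies the input to a fresh string and rewrites keys on the way, so the parsed series is never modified.

```rebol
json: {{"key one": {"value": 7}, "note": "a b"}}
output: copy ""

parse/all json [
    any [
        ; a quoted string followed by optional spaces and a colon is a key
        {"} copy key to {"} skip any " " ":"
        (append output join replace/all any [key copy ""] " " "-" ":")
        |
        ; any other quoted string is copied through unchanged
        {"} copy text to {"} skip
        (append output rejoin [{"} any [text ""] {"}])
        |
        ; everything else is copied one character at a time
        copy other skip (append output other)
    ]
]

print output    ; prints: {key-one: {value: 7}, note: "a b"}
```

The any [key copy ""] guards are there because Rebol 2 sets the copied word to none when the match is empty.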

Answer 2

Another way is to think of parse as a compiler-compiler with EBNF. If I recall the R2 syntax correctly:

copy token [rule] (append output token)

Assuming the input is syntactically correct and there is no {"} inside the strings:

thru {"} skip copy key to {"} skip
; we know ":" must be there, no check
thru {"} copy content to {"} skip
(append output rejoin[ {"} your-magic-with key {":"} content {"} ])

More precisely, matching character by character instead of using to:

any space  {"} copy key some [ string-char | "\" skip ] {"} 
any space ":" any space {"} copy content any [ string-char  | "\" skip ] {"} 
(append output rejoin[ {"} your-magic-with key {":"} content {"} ])
; content can be empty -> any, key not -> some

What is the syntax for making string-char a charset of everything except {"} and {\}?
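One way to write the charset asked about here is with complement and charset, both of which exist in Rebol 2 and Rebol 3. A minimal sketch follows; the sample input string is made up for illustration. The backslash needs no escaping because Rebol's string escape character is ^.

```rebol
; every character except the double quote and the backslash
string-char: complement charset {"\}

parse/all {"key one": 7} [
    {"} copy key some [string-char | "\" skip] {"}
    any " " ":" to end
]

print key    ; prints: key one
```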

I don't know whether R3 still works like this... :-/

Answer 3

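Since others have answered the parse question, I'll answer the P.S.:

Several proposed options were never added to replace, mainly because processing options has overhead, and the function already needs some interesting optimizations to handle the options it has. Once we have refined its API, we will try to replace the function with a native. It is basically similar to the reword function, whose final API we settled on only recently. For replace, we haven't had that discussion yet.

As for a /part option, nobody had suggested it before, and it might be conceptually a bit awkward to reconcile with the existing internal length calculations. It might be possible to have a limited /part option that accepts only integers, not offset references. It would probably be best if the /part length took precedence over the internally calculated length. Still, if we end up with an adjusted API, a /part option might not be needed.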
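In the meantime, the effect of a replace/all/part can be approximated with existing functions. The sketch below uses a hypothetical helper, replace-part (not part of Rebol), which copies the range, runs replace/all on the copy, and writes it back with change/part. It takes two series positions rather than an integer length, which is purely a choice made for this illustration.

```rebol
; hypothetical helper: replace every SEARCH with REPLACEMENT between START and STOP
replace-part: func [start [string!] stop [string!] search replacement /local piece] [
    piece: copy/part start stop
    replace/all piece search replacement
    change/part start piece stop    ; handles a change in length as well
]

text: copy {"key one": "a b"}
start: next text                    ; just past the opening quote
stop: find start {"}                ; the closing quote
replace-part start stop " " "-"

print text    ; prints: "key-one": "a b"
```

This is essentially what the paren in the question does by hand with copy/part, replace/all and change.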

Is the PARSE dialect supposed to be used for tasks that fundamentally modify the input?
