How to express branching in the Rebol PARSE dialect?

Time: 2015-05-24 11:33:31

Tags: parsing rebol rebol3

I have a MySQL schema like the following:

data: {
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `name` varchar(10) DEFAULT '' COMMENT 'the name',
    `content` text COMMENT 'something',
}

Now I want to extract some information from it: each field's name, type, and comment (if any). See below:

["id" "int" "" "name" "varchar" "the name" "content" "text" "something" ]

My code is:

parse data [
    any [ 
        thru {`} copy field to {`} {`}
        thru some space copy field-type to [ {(} | space]
        (comm: "")
        opt [ thru {COMMENT} thru some space thru {'} copy comm to {'}]
        (repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
    ]
]

But I get something like this:

["id" "int" "the name" "content" "text" "something"]

I know the opt .. line is not right.

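What I want to express is: if the COMMENT keyword is found first, extract the comment text; if a newline is found first, continue with the next iteration of the loop. But I don't know how to express that. Can anyone help?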

4 answers:

Answer 0 (score: 5)

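I am very much in favor (where possible) of building up a set of grammar rules with positive terms to match the target input: I find it more literate, more precise, more flexible, and easier to debug. In the snippet above, we can identify five core components: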

space: use [space][
    space: charset "^-^/ "
    [some space]
]

word: use [letter][
    letter: charset [#"a" - #"z" #"A" - #"Z" "_"]
    [some letter]
]

id: use [letter][
    letter: complement charset "`"
    [some letter]
]

number: use [digit][
    digit: charset "0123456789"
    [some digit]
]

string: use [char][
    char: complement charset "'"
    [any [some char | "''"]]
]

With the terms defined, writing a rule that describes the grammar of the input is relatively straightforward:

result: collect [
    parsed?: parse/all data [ ; parse/all for Rebol 2 compatibility
        opt space
        some [
            (field: type: none comment: copy "")
            "`" copy field id "`"
            space 
            copy type word opt ["(" number ")"]
            any [
                space [
                    "COMMENT" space "'" copy comment string "'"
                    | word | "'" string "'" | number
                ]
            ]
            opt space "," (keep reduce [field type comment])
            opt space
        ]
    ]
]

As an added bonus, we can validate the input.

if parsed? [new-line/all/skip result true 3]

A small application of new-line tidies up the result:

== [
    "id" "int" "" 
    "name" "varchar" "the name" 
    "content" "text" "something"
]
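One detail of the rule above is worth isolating: PARSE tries alternatives from left to right, so the "COMMENT" branch has to come before the generic word branch, which would otherwise match the keyword too. The following is a minimal sketch of that behavior, assuming Rebol 2 or Rebol 3 (where parse/all is accepted); classify and reversed are hypothetical helpers introduced only for illustration.

```rebol
REBOL []

; Same definition of a word as in the answer above.
letter: charset [#"a" - #"z" #"A" - #"Z" "_"]
word: [some letter]

; Hypothetical helper: reports which alternative matched.
classify: func [text [string!] /local kind] [
    parse/all text [
        "COMMENT" (kind: 'comment-keyword)   ; tried first
        | word (kind: 'plain-word)           ; tried only if the first branch fails
    ]
    kind
]

; Hypothetical helper with the alternatives swapped, to show why order matters.
reversed: func [text [string!] /local kind] [
    parse/all text [
        word (kind: 'plain-word)
        | "COMMENT" (kind: 'comment-keyword) ; never reached for alphabetic input
    ]
    kind
]

probe classify "COMMENT"   ; comment-keyword
probe classify "DEFAULT"   ; plain-word
probe reversed "COMMENT"   ; plain-word
```

This is the branching the question asks about: the first alternative that matches wins, and the remaining ones are not tried at that position.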

Answer 1 (score: 3)

I think this is closer to what you are after.

data: {
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `name` varchar(10) DEFAULT '' COMMENT 'the name',
    `content` text COMMENT 'something',
}
temp: []
parse data [
  any [ 
    thru {`} copy field to {`} {`}
    some space copy field-type to [ {(} | space]
    (comm: copy "")
    opt [ thru {COMMENT} some space thru {'} copy comm to {'}]
    (repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
  ]
]
probe temp

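Breaking down the differences:

  1. Set up the word temp with an empty block.
  2. Changed thru some space to some space, since this moves forward through the series in the same way. Note that the following returns false:

    parse "   " [ thru some space ]
    
  3. Changed comm: "" to comm: copy "" so that a fresh string is used each time a comment is extracted (this does not appear to affect the output, but it is good practice).

  4. Changed {COMMENT} thru some space to {COMMENT} some space, as per point 2.
  5. Added a probe at the end for debugging.
  6. As a side note, you can use ?? (almost) anywhere in a parse rule to help with debugging; it shows your current position.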
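The point about comm: copy "" can be seen outside of PARSE. The sketch below assumes Rebol 2 or Rebol 3 semantics, where a string literal in a function body is one value that is reused on every call; shared and fresh are hypothetical helpers that exist only to show the difference.

```rebol
REBOL []

; Hypothetical helper: S refers to the literal stored in the function body.
shared: func [/local s] [
    s: ""            ; the same string value is reused on every call
    append s "x"
]

; Hypothetical helper: S refers to a new string on every call.
fresh: func [/local s] [
    s: copy ""       ; a new, empty string each time
    append s "x"
]

shared
probe shared   ; "xx" (the literal was modified by the first call)
fresh
probe fresh    ; "x"
```

In the parse rule above, copy comm to {'} rebinds comm to a new string rather than modifying the literal, which is presumably why the output is the same either way. Some later Rebol derivatives may protect literals from modification, so treat this as a description of Rebol 2 and Rebol 3 only.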

Answer 2 (score: 3)

Use parse/all for string parsing:

data: {
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `name` varchar(10) DEFAULT '' COMMENT 'the name',
    `content` text COMMENT 'something',
}
nodata:   charset { ()'}
dat: complement nodata

collect [   
    parse/all data [
        some [
            thru {`} copy field to {`} (keep field) skip 
            some " " copy type some dat ( keep type   comm:  copy "" )  
            copy rest thru "," (
                parse/all rest [
                    some [
                        [","   (keep comm) ]  
                     |  ["COMMENT"   some nodata copy comm to "'"  ]
                     |  skip                        
                    ]
                ]
            )
        ]
    ]
]
== ["id" "int" "" "name" "varchar" "the name" "content" "text" "something"]

Another (better) solution using pure parse:

collect [   
    probe parse/all data [
        some [
            thru {`} copy field to {`} (keep field) skip 
            some " " copy type some dat ( keep type   comm:  ""  further: [])  
            some [ 
            ","   (keep comm  further:  [ to end  skip]) 
            |  ["COMMENT"   some nodata copy comm to "'"  ]
            |  skip  further                     
            ]
        ]
    ]
]
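The branch at the heart of both solutions, "take the comment if COMMENT shows up, otherwise move on by one character", can be isolated on a single field definition. This is only a sketch under the same Rebol 2 / Rebol 3 assumptions as above; comment-of is a hypothetical helper and not part of either solution.

```rebol
REBOL []

; Hypothetical helper: returns the comment of one field definition, or "".
comment-of: func [line [string!] /local comm] [
    comm: copy ""    ; default when the line has no COMMENT clause
    parse/all line [
        any [
            "COMMENT" thru "'" copy comm to "'" to end   ; branch 1: comment found
            | skip                                        ; branch 2: advance one character
        ]
    ]
    comm
]

probe comment-of {`id` int(10) unsigned NOT NULL AUTO_INCREMENT,}    ; ""
probe comment-of {`name` varchar(10) DEFAULT '' COMMENT 'the name',} ; "the name"
```

Because the input here is a single line, the search for COMMENT cannot run into the next field. The sketch does not handle an empty comment or a default value that itself contains the text COMMENT.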

Answer 3 (score: 1)

I figured out another way: read the data as a block! (one string per line) rather than as a single string!.

data: read/lines %data.txt  ; file names are written with a leading %
probe data
temp: copy []

foreach d data [
    parse d [ 
        thru {`} copy field to {`} {`}
        thru some space copy field-type to [ {(} | space]
        (comm: "")
        opt [ thru {COMMENT} thru some space thru {'} copy comm to {'}]
        (repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
    ]
]

probe temp