Question

I have a MySQL schema like the one below:

data: {
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `name` varchar(10) DEFAULT '' COMMENT 'the name',
    `content` text COMMENT 'something',
}

Now I want to extract some information from it: each field's name, type, and comment (if any). See below:

["id" "int" "" "name" "varchar" "the name" "content" "text" "something" ]

My code is:

parse data [
    any [ 
        thru {`} copy field to {`} {`}
        thru some space copy field-type to [ {(} | space]
        (comm: "")
        opt [ thru {COMMENT} thru some space thru {'} copy comm to {'}]
        (repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
    ]
]

But I get something like this:

["id" "int" "the name" "content" "text" "something"]

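I know the opt [...] line is not right.

What I want to express is: if the COMMENT keyword is found first, extract the comment text; if the end of the line is found first, continue with the next iteration. But I don't know how to express that. Can anyone help?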

Answer 1

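I strongly favour (where possible) building up a set of grammar rules from positive terms that match the target input. I find it more literate, more precise, more flexible, and easier to debug. In the snippet above, we can identify five core components: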

space: use [space][
    space: charset "^-^/ "
    [some space]
]

word: use [letter][
    letter: charset [#"a" - #"z" #"A" - #"Z" "_"]
    [some letter]
]

id: use [letter][
    letter: complement charset "`"
    [some letter]
]

number: use [digit][
    digit: charset "0123456789"
    [some digit]
]

string: use [char][
    char: complement charset "'"
    [any [some char | "''"]]
]

With the terms defined, writing a rule that describes the grammar of the input is relatively straightforward:

result: collect [
    parsed?: parse/all data [ ; parse/all for Rebol 2 compatibility
        opt space
        some [
            (field: type: none comment: copy "")
            "`" copy field id "`"
            space 
            copy type word opt ["(" number ")"]
            any [
                space [
                    "COMMENT" space "'" copy comment string "'"
                    | word | "'" string "'" | number
                ]
            ]
            opt space "," (keep reduce [field type comment])
            opt space
        ]
    ]
]

As an added bonus, we can validate the input.

if parsed? [new-line/all/skip result true 3]

A small application of new-line smartens up the result a little:

== [
    "id" "int" "" 
    "name" "varchar" "the name" 
    "content" "text" "something"
]
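
The branch the question asks about is the | inside the any [...] block of the rule above: alternatives are tried left to right, and the first one that matches is taken. A minimal sketch of that behaviour in isolation, assuming Rebol 2 (hence parse/all); the names item, kind, text and sample are made up for illustration and are not part of the answer's code:

```rebol
digit:  charset "0123456789"
letter: charset [#"a" - #"z" #"A" - #"Z" "_"]

; Alternatives are tried left to right; the first one that matches wins.
item: [
    "COMMENT" " " "'" copy text to "'" skip (kind: 'comment)
    | some letter (kind: 'word)
    | some digit (kind: 'number)
]

foreach sample ["COMMENT 'the name'" "unsigned" "10"] [
    kind: text: none
    parse/all sample item
    print [mold sample "is matched as" kind]
]
```

Putting the most specific alternative (the COMMENT form) first matters here, because a bare some letter would otherwise consume the keyword as an ordinary word.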

Answer 2

I think this is closer to what you are after.

data: {
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `name` varchar(10) DEFAULT '' COMMENT 'the name',
    `content` text COMMENT 'something',
}
temp: []
parse data [
  any [ 
    thru {`} copy field to {`} {`}
    some space copy field-type to [ {(} | space]
    (comm: copy "")
    opt [ thru {COMMENT} some space thru {'} copy comm to {'}]
    (repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
  ]
]
probe temp

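Breaking down the differences:

Set up temp as an empty block.
Changed thru some space to some space, since this moves forward through the series in the same way. Note that the following is false: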
```
parse "   " [ thru some space ]
```
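Changed comm: "" to comm: copy "" to make sure you get a new string each time a comment is extracted (this doesn't seem to affect the output, but it is good practice).
Changed {COMMENT} thru some space to {COMMENT} some space, for the same reason as the second change above.
Added a probe at the end for debugging.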
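
The reason copy "" is considered good practice is that a literal string in Rebol code is a single value that gets reused, so modifying it in place carries over to the next evaluation. A minimal sketch of that general behaviour, assuming Rebol 2; shared and fresh are hypothetical helper functions, not part of the code above:

```rebol
; Hypothetical helper functions, for illustration only.
shared: func [/local s] [
    s: ""          ; the same literal string is reused on every call
    append s "x"
]
fresh: func [/local s] [
    s: copy ""     ; a new empty string on every call
    append s "x"
]

shared
probe shared   ; the second call still sees the "x" left by the first
fresh
probe fresh    ; each call starts from an empty string
```

In the parse rule itself, copy comm to {'} rebinds comm to a newly copied string rather than appending to the literal, which is consistent with the observation that the change does not affect the output here.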

As a side note, you can use ?? (almost) anywhere in a parse rule to help with debugging; it shows where you currently are.

Answer 3

Use parse/all for string parsing:

data: {
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `name` varchar(10) DEFAULT '' COMMENT 'the name',
    `content` text COMMENT 'something',
}
nodata:   charset { ()'}
dat: complement nodata

collect [   
    parse/all data [
        some [
            thru {`} copy field to {`} (keep field) skip 
            some " " copy type some dat ( keep type   comm:  copy "" )  
            copy rest thru "," (
                parse/all rest [
                    some [
                        [","   (keep comm) ]  
                     |  ["COMMENT"   some nodata copy comm to "'"  ]
                     |  skip                        
                    ]
                ]
            )
        ]
    ]
]
== ["id" "int" "" "name" "varchar" "the name" "content" "text" "something"]

Another (better) solution using pure parse:

collect [   
    probe parse/all data [
        some [
            thru {`} copy field to {`} (keep field) skip 
            some " " copy type some dat ( keep type   comm:  ""  further: [])  
            some [ 
            ","   (keep comm  further:  [ to end  skip]) 
            |  ["COMMENT"   some nodata copy comm to "'"  ]
            |  skip  further                     
            ]
        ]
    ]
]

Answer 4

I worked out another way: get the data as a block! rather than a string!.

data: read/lines %data.txt ; file names need a leading % in Rebol
probe data
temp: copy []

foreach d data [
    parse d [ 
        thru {`} copy field to {`} {`}
        thru some space copy field-type to [ {(} | space]
        (comm: "")
        opt [ thru {COMMENT} thru some space thru {'} copy comm to {'}]
        (repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
    ]
]

probe temp

How to express a branch in the Rebol PARSE dialect?
