Question

I am trying to parse function calls. Here are the variants:

add 8 2
add x y
add (inc x) (dec y)
funcWithoutArgs

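Depending on how I arrange the parsers in my code, and perhaps on how they are written, I get either errors or parses that succeed but are not what I want. For example, this: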

add 4 7

returns the following AST:

[Call ("foo",[Number 4]);
 Number 7]

So it only takes the first argument.

When I do this:

foo x y

it gives me this AST:

[Call ("foo",[Call ("x",[Call ("y",[])])])]

That is not what I want, because here each argument calls the next one as its own argument.

Another example, when I do this:

foo x y
inc x

I get:

[Call ("foo",[Call ("x",[Call ("y",[Call ("inc",[Call ("x",[])])])])])]

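It does the same as above, but it also pulls in the code on the following line. When I ask the parser to require a newline (see the code), I get this instead: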

[Call ("foo",[]); Call ("x",[]); Call ("y",[]); Call ("inc",[]); Call ("x",[])]

Even putting the arguments in parentheses doesn't work:

foo (x) (y)

gives:

[Call ("foo",[]); Call ("x",[]); Call ("y",[])]

And:

add (inc x) (dec y)

gives:

Error in Ln: 1 Col: 1
Note: The error occurred on an empty line.

The parser backtracked after:
  Error in Ln: 2 Col: 5
  add (inc x) (dec y)
      ^
  Expecting: end of input or integer number (32-bit, signed)

  The parser backtracked after:
    Error in Ln: 2 Col: 10
    add (inc x) (dec y)
             ^
    Expecting: ')'

[]

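In short, my function call parser doesn't work properly. Every time I change something (a newline, an attempt, or a different hierarchy), something else breaks... Do you have any idea how to solve this very annoying problem?

Here is the minimal working code I used: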

open FParsec

// Ast

type Expression =
    | Number of int
    | Call of string * Expression list

type Program = Expression list

// Tools

let private bws p =
    spaces >>? p .>>? spaces

let private suiteOf p =
    sepEndBy p spaces1

let inline private betweenParentheses p label =
    between (pstring "(") (pstring ")") p
    <?> (label + " between parentheses")

let private identifier =
    many1Satisfy2 isLetter (fun c -> isLetter c)

// Expressions

let rec private call = parse {
        let! call = pipe2 (spaces >>? identifier) (spaces >>? parameters)
                        (fun id parameters -> Call(id, parameters)) // .>>? newline
        return call
    }

and private parameters = suiteOf expression

and private callFuncWithoutArgs =
    identifier |>> fun id -> Call(id, [])

and private number = pint32 |>> Number

and private betweenParenthesesExpression =
    parse { let! ex = betweenParentheses expression "expression"
            return ex }

and private expression =
    bws (attempt betweenParenthesesExpression <|>
         attempt number <|>
         attempt call <|>
         callFuncWithoutArgs)

// -------------------------------

let parse code =
    let parser = many expression .>>? eof

    match run parser code with
        | Success(result, _, _) -> result
        | Failure(msg, _, _) ->
            printfn "%s" msg
            []

System.Console.Clear()

parse @"
add 4 7

foo x y

inc x

foo (x) (y)

add (inc x) (dec y)

" |> printfn "%A"

Answer 1

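Your main problem is that the high-level design of your parser is wrong.

Your current design says that an expression can be:

1. An expression between parentheses (a "sub-expression", so to speak) (no problem here)
2. A number (no problem here)
3. A call with parameters, which is an identifier followed by a space-separated list of expressions (this is the main part of the problem)
4. A call without parameters, which is a single identifier (this contributes to the problem)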

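Consider the expression foo x y, and apply these rules in the order the parser does. There are no parentheses and foo is not a number, so it is either rule 3 or rule 4. We try rule 3 first. foo is followed by x y: does x y parse as an expression? Why, yes it does: it parses as a call with parameters, where x is the function and y is the parameter. Since x y matches rule 3, it is parsed according to rule 3 without ever checking rule 4, and so foo x y is parsed the way foo (x y) would be: a call to foo with a single parameter, which is itself a call to x with the parameter y.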

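How can this be fixed? Well, you could try swapping the order of rules 3 and 4, so that a call without parameters is checked before a call with parameters (which would make x y parse as just x). But this would fail, because foo x y would then match as just foo. So putting rule 4 before rule 3 doesn't work here.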
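The ordering problem can be reproduced in isolation. The following is a minimal sketch, not the code from the question: ident, noArgs, withArgs, ruleFourFirst and ruleThreeFirst are hypothetical names, and arguments are simplified to bare identifiers separated by a single space.

```fsharp
// Run as a script after referencing the package: #r "nuget: FParsec"
open FParsec

// Hypothetical simplified parsers, for illustration only.
let ident : Parser<string, unit> = many1Satisfy isLetter

// Rule 4: a bare identifier is a call with no arguments.
let noArgs : Parser<string * string list, unit> =
    ident |>> fun name -> (name, [])

// Rule 3 (simplified): an identifier followed by one or more arguments.
let withArgs : Parser<string * string list, unit> =
    ident .>>. many1 (pchar ' ' >>. ident)

// Rule 4 first: noArgs succeeds on "foo" and stops; " x y" is left unconsumed.
let ruleFourFirst = noArgs <|> attempt withArgs

// Rule 3 first: arguments are collected; attempt lets a lone "foo" fall back to noArgs.
let ruleThreeFirst = attempt withArgs <|> noArgs

let show p input =
    match run p input with
    | Success (result, _, _) -> printfn "%A" result
    | Failure (msg, _, _) -> printfn "%s" msg

show ruleFourFirst "foo x y"   // the call comes back with an empty argument list
show ruleThreeFirst "foo x y"  // the call comes back with arguments x and y
show ruleThreeFirst "foo"      // falls back to the no-argument rule
```

With rule 4 first, the parse of foo x y stops after foo. With rule 3 first, the flat argument list comes back here only because the arguments are bare identifiers. As soon as an argument may itself be a call with parameters, the nesting described above returns, which is why reordering alone is not enough.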

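The real solution is to split the rules for an expression into two levels. The "inner" level, which I'll call a "value", could be:

1. An expression between parentheses
2. A number
3. A function call without parameters

And the "outer" level, the parse rules for expressions, would be:

1. A function call with parameters, all of which must be values, not expressions
2. A value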

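Note that these parsing levels are mutually recursive, so you'll need to use createParserForwardedToRef in your implementation. Let's look at how foo x y would be parsed with this design: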

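First, foo parses as an identifier, so check whether it could be a function call with parameters. Does x parse as a value? Yes, under rule 3 of values. Does y parse as a value? Yes, under rule 3 of values. Therefore foo x y parses as a function call with two parameters.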

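Now what about funcWithoutParameters? It fails rule 1 of expressions, because it is not followed by a list of parameters. So it is checked against rule 2 of expressions, and then it matches under rule 3 of values.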

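Okay, a basic sanity check of the pseudocode works, so let's turn it into code. But first, there is one other problem in your parser that I haven't mentioned yet: you may not have realized that the FParsec spaces parser also matches newlines. So when you wrap your expression parser in bws ("between whitespace"), it also consumes newlines after the text it has parsed. So when you parse something like: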

foo a b
inc c

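suiteOf expression sees the list a b inc c and turns all of it into parameters of foo. In the code below, I distinguish between FParsec's spaces parser (which includes newlines) and a parser that matches only horizontal whitespace (spaces and tabs, but not newlines), and I use each one in the appropriate place. The following code implements the design described in this answer, and its output looks right to me for all the test expressions you wrote: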

open FParsec

// Ast

type Expression =
    | Number of int
    | Call of string * Expression list

type Program = Expression list

// Tools

let private justSpaces  = skipMany  (pchar ' ' <|> pchar '\t')
let private justSpaces1 = skipMany1 (pchar ' ' <|> pchar '\t')

let private bws p =
    spaces >>? p .>>? spaces

let private suiteOf p =
    sepEndBy1 p (justSpaces1)

let inline private betweenParentheses p label =
    between (pstring "(") (pstring ")") p
    <?> (label + " between parentheses")

let private identifier =
    many1Satisfy2 isLetter (fun c -> isLetter c)

// Expressions

let private expression, expressionImpl = createParserForwardedToRef()

let private betweenParenthesesExpression =
    parse { let! ex = betweenParentheses expression "expression"
            return ex }

let private callFuncWithoutArgs =
    (identifier |>> fun id -> Call(id, []))

let private number = pint32 |>> Number

let private value =
    justSpaces >>? (attempt betweenParenthesesExpression <|>
                    attempt number <|>
                    callFuncWithoutArgs)

let private parameters = suiteOf value

let rec private callImpl = parse {
        let! call = pipe2 (justSpaces >>? identifier) (justSpaces >>? parameters)
                          (fun id parameters -> Call(id, parameters))
        return call }

let call = callImpl

expressionImpl.Value <- bws (attempt call <|> value)

// -------------------------------

let parse code =
    let parser = many expression .>>? (spaces >>. eof)

    match run parser code with
        | Success(result, _, _) -> result
        | Failure(msg, _, _) ->
            printfn "%s" msg
            []

System.Console.Clear()

parse @"
add 4 7

foo x y

inc x

foo (x) (y)

add (inc x) (dec y)

" |> printfn "%A"
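The difference between the two whitespace parsers can be seen on its own. This is a minimal sketch under simplifying assumptions: word, greedy and lineOnly are hypothetical names, and justSpaces is defined the same way as in the code above.

```fsharp
// Run as a script after referencing the package: #r "nuget: FParsec"
open FParsec

// Horizontal whitespace only (spaces and tabs), as in the answer's code.
let justSpaces : Parser<unit, unit> = skipMany (pchar ' ' <|> pchar '\t')

// Hypothetical token parser: one or more letters.
let word : Parser<string, unit> = many1Satisfy isLetter

// FParsec's spaces also skips newlines, so words on the next line are pulled in.
let greedy = many (word .>> spaces)

// justSpaces stops at the newline, so only the first line is read.
let lineOnly = many (word .>> justSpaces)

let words p input =
    match run p input with
    | Success (result, _, _) -> result
    | Failure (msg, _, _) -> failwith msg

printfn "%A" (words greedy "foo a b\ninc c")   // all five words
printfn "%A" (words lineOnly "foo a b\ninc c") // only foo, a and b
```

This is the reason value and parameters use justSpaces in the code above, while bws keeps spaces around a whole top-level expression.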

P.S. I used the following operator, suggested by http://www.quanttec.com/fparsec/users-guide/debugging-a-parser.html, which helped me a great deal in tracking down the problem:

let (<!>) (p: Parser<_,_>) label : Parser<_,_> =
    fun stream ->
        printfn "%A: Entering %s" stream.Position label
        let reply = p stream
        printfn "%A: Leaving %s (%A)" stream.Position label reply.Status
        reply

Usage: turn let parseFoo = ... into let parseFoo = ... <!> "foo". You will then get a stream of debugging output in the console that looks something like this:

(Ln: 2, Col: 20): Entering expression
(Ln: 3, Col: 1): Entering call
(Ln: 3, Col: 5): Entering parameters
(Ln: 3, Col: 5): Entering bwParens
(Ln: 3, Col: 5): Leaving bwParens (Error)
(Ln: 3, Col: 5): Entering number
(Ln: 3, Col: 6): Leaving number (Ok)
(Ln: 3, Col: 7): Entering bwParens
(Ln: 3, Col: 7): Leaving bwParens (Error)
(Ln: 3, Col: 7): Entering number
(Ln: 3, Col: 8): Leaving number (Ok)
(Ln: 3, Col: 8): Leaving parameters (Ok)
(Ln: 3, Col: 8): Leaving call (Ok)
(Ln: 3, Col: 8): Leaving expression (Ok)

This is very helpful when you are trying to figure out why your parser isn't doing what you expect.

Parsing function calls - FParsec