FsLexYacc: lexical analysis and parsing with F#

Date: 2017-11-25 14:25:02

Tags: parsing f# interpreter lexer

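I am trying to learn the basics of interpreters and compilers using F# and the FsLexYacc library, but I cannot work out the principles behind writing the lexer and parser files. I am following this example, but it only builds an interpreter for a few simple SQL queries. What I am looking for is how to turn this grammar into a working lexer and parser in F#. In case it helps, I am including my AST, lexer and parser files.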

Here is the AST:

module Ast


type TypeIdentifier =
    |Boolean of bool
    |Integer of int
    |Float of float
    |String of string

and BinaryOperators = 
    |Add
    |Subtract
    |Multiply
    |Equal
    |NotEqual
    |Less
    |Greater
    |LessEqual
    |GreaterEqual
    |Semicolon
    |Colon
    |Range
    |Assign

Here is the lexer:

{
module SqlLexer
open System
open SqlParser
open Microsoft.FSharp.Text.Lexing

let keywords = [

    "div", DIV;
    "or", OR;
    "and", AND;
    "not", NOT;
    "if", IF;
    "then", THEN;
    "else", ELSE;
    "of", OF;
    "while", WHILE;
    "do", DO;
    "array", ARRAY;
    "procedure", PROCEDURE;
    "program", PROGRAM;
    "begin", BEGIN;
    "end", END;
    "var", VAR

] |> Map.ofList

let ops = [

    "+", ADD;
    "-", SUBTRACT;
    "*", MULTIPLY;
    "=", EQUAL;
    "<>", NOTEQUAL;
    "<", LESS;
    "<=", LESSEQUAL;
    ">", GREATER;
    ">=", GREATEREQUAL;
    ":=", ASSIGN;
    ".", POINT;
    ",", COMMA;
    ";", SEMICOLON;
    ":", COLON;
    "..", RANGE;

] |> Map.ofList 
}

let char                = ['a'-'z' 'A'-'Z']
let digit               = ['0'-'9']
let int                 = '-'?digit+
let float               = '-'?digit+ '.' digit+
let identifier          = char(char|digit)*
let whitespace          = [' ' '\t']
let newline             = "\r\n" | '\n' | '\r'
let operator            = "+" | "-" | "*" | "=" | "<>" | "<" | "<=" | ">" | ">=" | ":=" | "." | "," | ";" | ":" | ".."

rule tokenize = parse
| whitespace        { tokenize lexbuf }
| newline           { lexbuf.EndPos <- lexbuf.EndPos.NextLine; tokenize lexbuf }
| int               { INT(Int32.Parse(LexBuffer<_>.LexemeString lexbuf)) }
| float             { FLOAT(Double.Parse(LexBuffer<_>.LexemeString lexbuf)) }
| operator          { ops.[LexBuffer<_>.LexemeString lexbuf] }
| identifier        { match keywords.TryFind(LexBuffer<_>.LexemeString lexbuf) with
                        | Some(token) -> token
                        | None -> ID(LexBuffer<_>.LexemeString lexbuf)}
| eof               { EOF }

Here is my parser:

%{
open Ast
%}

%token <string> ID
%token <int> INT
%token <float> FLOAT
%token <bool> BOOL

%token DIV
%token AND OR NOT
%token IF THEN ELSE
%token WHILE DO
%token ARRAY OF
%token PROGRAM 
%token PROCEDURE
%token BEGIN END
%token VAR

%token ADD SUBTRACT MULTIPLY
%token EQUAL NOTEQUAL
%token LESS LESSEQUAL
%token GREATER GREATEREQUAL
%token ASSIGN
%token POINT COMMA RANGE
%token SEMICOLON COLON

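%token EOF

%start start
/* Ast.Program is assumed here: a record { identifier: string; block: ... } that still has to be added to the AST */
%type <Ast.Program> start

%%

start:
    | PROGRAM ID SEMICOLON block POINT EOF  { { identifier = $2; block = $4 } }

/* procedureDeclarationPart, statementPart and variableDeclaration still need rules of their own */
block:
    | variableDeclarationPart procedureDeclarationPart statementPart  { ($1, $2, $3) }

variableDeclarationPart:
    |                               { [] }
    | VAR variableDeclarationList   { List.rev $2 }

/* The EBNF repetition { variableDeclaration ";" } becomes a left-recursive list rule */
variableDeclarationList:
    | variableDeclaration SEMICOLON                          { [$1] }
    | variableDeclarationList variableDeclaration SEMICOLON  { $2 :: $1 }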

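I am not looking for an answer with finished code. I would like an explanation, ideally through similar examples or a tutorial on interpreting a programming language such as Pascal with the FsLexYacc library.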

1 answer:

Answer 0 (score: 1)

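I struggled with this same problem a long time ago. Since the FsLexYacc library is based on lex and yacc, what I ended up doing was working through a quick tutorial on plain lex and yacc (or flex & bison), except that instead of writing the C code I translated it to F# and used it in my test project. Once I had the concepts down, I went through https://github.com/fsprojects/FsLexYacc/blob/master/docs/content/jsonParserExample.md and implemented a few esoteric languages on my own.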
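As a rough illustration of that translation step, here is a minimal fsyacc sketch. It is not taken from the tutorials or from the JSON example, and the token and rule names are made up for illustration. Where a yacc rule assigns to `$$` in a C action, the fsyacc rule simply evaluates to an F# expression:

```fsharp
%{
// Sketch only; every token and rule name in this file is illustrative.
%}

%token <int> INT
%token ADD MULTIPLY EOF

/* Precedence, lowest first: MULTIPLY binds tighter than ADD */
%left ADD
%left MULTIPLY

%start start
%type <int> start

%%

/* yacc/bison, C action:   expr: expr '+' expr { $$ = $1 + $3; }        */
/* fsyacc, F# action: the value of the expression in braces is the result */
start:
    | expr EOF              { $1 }

expr:
    | INT                   { $1 }
    | expr ADD expr         { $1 + $3 }
    | expr MULTIPLY expr    { $1 * $3 }
```

Running fsyacc over a file like this should generate an F# module exposing a `start` function, which takes the lexer's tokenizing function and a `LexBuffer` and returns the value declared in `%type`.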

Hope this helps.

PS: The F# compiler is open source and uses FsLexYacc, so you can also try reading its lexer and parser sources.