Question

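I am writing a Python interpreter in OCaml using ocamllex, and to handle the indentation-based syntax I want to:

tokenize the input with ocamllex
iterate over the list of lexed tokens and insert INDENT and DEDENT tokens as the parser needs them
parse this list into an AST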
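
For illustration, the second step might look roughly like the sketch below once a token list is available. The token type here is a hypothetical stand-in for Parser.token, and the code assumes the lexer emits one TAB token per indentation level and an EOL token at the end of every line. That is a simplification of Python's real rules: spaces are not handled, and a jump of several levels simply emits several INDENT tokens instead of raising an error.

```ocaml
(* Hypothetical stand-in for Parser.token; INDENT and DEDENT are the
   tokens this pass adds. *)
type token =
  | TAB | EOL | COLON | RETURN | NAME of string
  | INDENT | DEDENT | EOF

(* Count the TAB tokens at the start of a line; return (count, rest). *)
let rec leading_tabs n = function
  | TAB :: rest -> leading_tabs (n + 1) rest
  | rest -> (n, rest)

(* Push [count] copies of [tok] onto [acc] (acc is kept in reverse order). *)
let rec push_n tok count acc =
  if count <= 0 then acc else push_n tok (count - 1) (tok :: acc)

(* Replace leading TABs with INDENT/DEDENT tokens, assuming one TAB per level. *)
let insert_indents (tokens : token list) : token list =
  let rec line_start depth acc toks =
    let new_depth, rest = leading_tabs 0 toks in
    match rest with
    | [] -> List.rev (push_n DEDENT depth acc)
    | EOF :: _ -> List.rev (EOF :: push_n DEDENT depth acc)
    | EOL :: rest' -> line_start depth acc rest'  (* blank line: drop it *)
    | _ ->
      let acc =
        if new_depth > depth then push_n INDENT (new_depth - depth) acc
        else push_n DEDENT (depth - new_depth) acc
      in
      in_line new_depth acc rest
  and in_line depth acc = function
    | EOL :: rest -> line_start depth (EOL :: acc) rest
    | [] -> List.rev (push_n DEDENT depth acc)
    | EOF :: _ -> List.rev (EOF :: push_n DEDENT depth acc)
    | tok :: rest -> in_line depth (tok :: acc) rest
  in
  line_start 0 [] tokens
```

The leading TAB tokens are consumed and replaced by INDENT or DEDENT, so the parser's grammar would only need to mention those two tokens.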

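However, in ocamllex the lexing step works on a lexbuf stream, which is hard to iterate over for indentation checking. Is there a good way to extract a list of tokens from a lexbuf, i.e.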

let lexbuf = (Lexing.from_channel stdin) in
let token_list = tokenize lexbuf

where token_list has type Parser.token list? My hack was to define a trivial parser such as

tokenize: /* used by the parser to read the input into the indentation function */
  | token EOL { $1 @ [EOL] }
  | EOL { SEP :: [EOL] }

token:
  | COLON { [COLON] }
  | TAB { [TAB] }
  | RETURN { [RETURN] }
   ...
  | token token %prec RECURSE { $1 @ $2 }

and to call it like this:

    let lexbuf = (Lexing.from_channel stdin) in
    let temp = (Parser.tokenize Scanner.token) lexbuf in (* char buffer to token list *)

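But this has all sorts of problems, such as shift/reduce conflicts and unnecessary complexity. Is there a better way to write a lexbuf -> Parser.token list function in OCaml?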
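
One possible shape for such a function, offered as a minimal sketch rather than a definitive answer: call the lexer in a loop until it returns an end-of-file token. The token type below is a hypothetical stand-in for Parser.token, and the sketch assumes the lexer has a rule that returns EOF at end of input. The second function, lexer_of_list (a name made up here), goes the other way and wraps a token list in a function of the type that ocamlyacc-generated entry points expect as their first argument.

```ocaml
(* Hypothetical stand-in for Parser.token. *)
type token = COLON | TAB | RETURN | EOL | EOF

(* Call the lexer repeatedly until it returns EOF, collecting the
   tokens in source order. Assumes the lexer does return EOF at some point. *)
let tokenize (next : Lexing.lexbuf -> token) (lexbuf : Lexing.lexbuf)
    : token list =
  let rec loop acc =
    match next lexbuf with
    | EOF -> List.rev (EOF :: acc)
    | tok -> loop (tok :: acc)
  in
  loop []

(* Wrap a token list as a lexer function. The lexbuf argument is ignored,
   and EOF is returned forever once the list is exhausted. *)
let lexer_of_list (tokens : token list) : Lexing.lexbuf -> token =
  let remaining = ref tokens in
  fun _lexbuf ->
    match !remaining with
    | [] -> EOF
    | tok :: rest -> remaining := rest; tok
```

In the real project the call would presumably be tokenize Scanner.token (Lexing.from_channel stdin). After post-processing, the list could be handed to the parser as Parser.program (lexer_of_list tokens) (Lexing.from_string ""), where Parser.program is an assumed name for the grammar's entry point. Note that the dummy lexbuf carries no useful position information, so error locations would need to be tracked separately.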

Extracting a token list from an OCamllex lexbuf

0 answers: