有没有办法在idris光年库中编写chainl函数?

时间:2015-10-30 17:38:04

标签: regex haskell recursion idris lightyear

我试图在Idris中正式化基于正则表达式的字符串搜索工具 (当前状态here)。但我正在解决正则表达式的问题。我已经尝试构建一个小的解析库,但放弃了这个,转而使用Lightyear,这是一个用于Idris的解析组合库。

由于我已经习惯了Haskell,我尝试使用类似的策略,而不是使用Parsec。我的主要问题是如何在Lightyear解析器上处理左递归?我尝试了几种编码,但几乎所有解析器最终都会循环并在生成的代码中导致分段错误。

2 个答案:

答案 0 :(得分:4)

我不认识Lightyear,但我在将Parsec移植到Idris方面取得了一些成功:

module Parser

data Parser : Type -> Type where
     P : (String -> List (a, String)) -> Parser a

unP : Parser a -> String -> List (a, String)
unP (P f) = f

total stripPrefix : (Eq a) => List a -> List a -> Maybe (List a)
stripPrefix [] ys = Just ys
stripPrefix (x::xs) (y::ys) = if (x == y) then stripPrefix xs ys else Nothing
stripPrefix _ _  = Nothing

total token : String -> Parser ()
token tk = P $ \s => case stripPrefix (unpack tk) (unpack s) of
      Just s' => [((), pack s')]
      Nothing => []

total skip : Parser ()
skip = P $ \s => case unpack s of
     [] => []
     (_::s') => [((), pack s')]

instance Functor Parser where
  map f p = P $ \s => map (\(x, s') => (f x, s')) (unP p s)

instance Applicative Parser where
  pure x = P $ \s => [(x, s)]
  (P pf) <*> (P px) = P $ \s => concat (map (\(f, s') => map (\(x, s'') => (f x, s'')) (px s')) (pf s))

instance Alternative Parser where
  empty = P $ \s => []
  (P p1) <|> (P p2) = P $ \s => case p1 s of
     [] => p2 s
     results => results

instance Monad Parser where
  px >>= f = P $ \s => concat (map (\(x, s') => unP (f x) s') (unP px s))

total runParser : Parser a -> String -> Maybe a
runParser (P p) s = case p s of
  [(x, "")] => Just x
  _         => Nothing

这允许chainl

的直接复制粘贴实现
chainl1 : Parser a -> Parser (a -> a -> a) -> Parser a
chainl1 p op = p >>= rest
  where
    rest x = do { f <- op; y <- p; rest $ f x y } <|> return x

chainl : Parser a -> Parser (a -> a -> a) -> a -> Parser a
chainl p op x = chainl1 p op <|> return x

然后我们可以直接从chainl文档中对表达式解析器进行音译(我太懒了,无法实现正确的integer解析器,所以我们只使用一元) :

parens : Parser a -> Parser a
parens p = token "(" *> p <* token ")"

symbol : String -> Parser ()
symbol = token

integer : Parser Nat
integer = P $ \s => case unpack s of
     ('Z'::s') => [(Z, pack s')]
     ('S'::s') => map (\(n, s'') => (S n, s'')) $ unP integer (pack s')
     _ => []

mutual
    expr : Parser Nat
    expr = term   `chainl1` addop

    term : Parser Nat
    term = factor `chainl1` mulop

    factor : Parser Nat
    factor  = parens expr <|> integer

    mulop : Parser (Nat -> Nat -> Nat)
    mulop = (symbol "*" *> pure (*)) <|>
            (symbol "/" *> pure div)

    addop : Parser (Nat -> Nat -> Nat)
    addop = (symbol "+" *> pure (+)) <|>
            (symbol "-" *> pure (-))

现在,如果你试试这个:

main : IO ()
main = do
  s <- getLine
  printLn $ runParser expr s
然后它将具有你所观察到的相同的不同行为。但是,我们可以做两个小的改动:

  1. 介绍一个懒惰的替代组合子:

    orElse : Parser a -> Lazy (Parser a) -> Parser a
    orElse p1 p2 = P $ \s => case unP p1 s of
       [] => unP p2 s
       results => results
    
  2. 通过翻转两个备选方案,确保factor的递归部分(即parens expr部分)处于此惰性位置:

    factor = integer `orElse`  parens expr
    
  3. 然后按预期工作:

    13:06:07 [cactus@galaxy brainfuck]$ idris Expr.idr -o Expr
    13:06:27 [cactus@galaxy brainfuck]$ echo "SZ+(SSZ*SSSZ)" | ./Expr
    Just 7
    

答案 1 :(得分:0)

chainlchainl1组合器可与Lightyear包一起使用。但是,它们是默认提供的。我已将组合器添加到我自己的模块中,我需要它们:

chainl1 : Parser a -> Parser (a -> a -> a) -> Parser a
chainl1 p op = p >>= rest
  where rest a1 = (do f <- op
                      a2 <- p
                      rest (f a1 a2)) <|> pure a1

chainl : Parser a -> Parser (a -> a -> a) -> a -> Parser a
chainl p op a = (p `chainl1` op) <|> pure a

似乎工作正常。希望有所帮助。