How do I parse an S-expression into a data structure in Haskell?

Time: 2018-10-11 07:29:38

Tags: parsing haskell

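I'm new to Haskell and could use some guidance.

Challenge: take an S-expression and parse it into a record.

Where I've succeeded: I can take a file and read its contents into a String. However, parsing the resulting Text into a DFA fails.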


The following code returns an error:

 let 
        toDFA :: [L.Text] -> EntryDFA
        toDFA t =
           let [q,a,d,s,f] = t
           in EntryDFA { 
               state = read q
              ,alpha = read a
              ,delta = read d
              ,start = read s
              ,final = read f }

There must be a more idiomatic way.

1 Answer:

Answer 0: (score: 0)

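read is a partial function of type Read a => String -> a; it throws an exception when parsing fails. In general you want to avoid it (if you have a String, use readMaybe instead). String and L.Text are different types, which is why you get the error.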
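To illustrate the point about read, here is a minimal sketch that converts lazy Text to a String and then uses readMaybe, so that bad input gives Nothing rather than an exception. The helper name readInt is made up for this example, and L is assumed to be Data.Text.Lazy, as the L.Text in the question suggests.

```haskell
import Text.Read (readMaybe)
import qualified Data.Text.Lazy as L

-- Hypothetical helper (not from the question): read an Int out of lazy Text.
-- L.unpack bridges the Text/String mismatch; readMaybe returns Nothing
-- instead of throwing when the input is not a valid Int.
readInt :: L.Text -> Maybe Int
readInt = readMaybe . L.unpack

-- Usage (e.g. in GHCi):
--   readInt (L.pack "42")
--   readInt (L.pack "forty-two")
```

This only fixes the type mismatch and the partiality. It does not make read-based parsing of the whole S-expression any more convenient, which is what the rest of the answer addresses.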

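Your example code is missing an additional ) after trans-func.

I'm using the Megaparsec package, which provides a convenient way to use parser combinators. The library's author has also written a longer tutorial.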

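The basic idea is that Parser a is the type of something that can parse a value of type a. Text.Megaparsec provides several functions (parse, parseMaybe, etc.) for running a parser on a "stringy" data type (such as String, or strict/lazy Text).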
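As a small, hedged sketch of running a parser, the snippet below defines a toy parser for a run of digits and runs it with both parseMaybe and parse. The names digits, accepted, rejected and describe are invented for this illustration. The Parser synonym uses () as the custom error type, the same choice the full program makes.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative (some)
import Data.Text (Text)
import qualified Text.Megaparsec as MP
import qualified Text.Megaparsec.Char as MPC

type Parser = MP.Parsec () Text

-- Toy parser for this illustration: one or more digits.
digits :: Parser String
digits = some MPC.digitChar

-- parseMaybe succeeds only if the parser consumes the whole input.
accepted :: Maybe String
accepted = MP.parseMaybe digits "123"

rejected :: Maybe String
rejected = MP.parseMaybe digits "12a"

-- parse takes a source name (used in error messages) and returns an Either.
-- It does not require the whole input to be consumed.
describe :: Text -> String
describe input =
  case MP.parse digits "<example>" input of
    Left _   -> "parse error"
    Right ds -> "parsed " ++ ds
```

parseMaybe is handy for quick experiments, while parse is the better fit for a real program because it keeps the error information.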

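When you use do notation with IO, it means "perform one action after another". Similarly, you can use do notation with Parser, where it means "parse this thing, then parse the next thing".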
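A minimal sketch of that sequencing, assuming a made-up input format of one letter followed by a decimal number (for example q12). The parser name labelled is hypothetical and is not part of the full program.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Text.Megaparsec as MP
import qualified Text.Megaparsec.Char as MPC
import qualified Text.Megaparsec.Char.Lexer as MPCL

type Parser = MP.Parsec () Text

-- Hypothetical example: parse a letter, then parse a number, in that order.
labelled :: Parser (Char, Int)
labelled = do
  c <- MPC.letterChar   -- first parse a single letter
  n <- MPCL.decimal     -- then parse the decimal number that follows it
  return (c, n)

-- Usage (e.g. in GHCi):
--   MP.parseMaybe labelled "q12"
```

If any step fails, the whole do block fails, so no manual error checking is needed between the steps.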

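p1 *> p2 means: run parser p1, then run p2, and return the result of p2. p1 <* p2 means: run parser p1, then run p2, and return the result of p1. You can also look up the documentation on Hoogle if anything is hard to follow.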
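The two operators can be combined to keep only the middle part of a sequence. The sketch below is a standalone illustration; the name bracketed is invented here, and the full program uses MP.between for the same purpose.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative (some)
import Data.Text (Text)
import qualified Text.Megaparsec as MP
import qualified Text.Megaparsec.Char as MPC

type Parser = MP.Parsec () Text

-- Hypothetical example: digits wrapped in parentheses.
-- '*>' discards the result of the opening parenthesis,
-- '<*' discards the result of the closing one,
-- so only the digits in between are returned.
bracketed :: Parser String
bracketed = MPC.char '(' *> some MPC.digitChar <* MPC.char ')'

-- Usage (e.g. in GHCi):
--   MP.parseMaybe bracketed "(42)"
```

Both operators associate to the left with the same precedence, so the expression reads as (open *> digits) <* close.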

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE NamedFieldPuns    #-}

-- In practice, many of these imports would be unqualified, but I've
-- opted for explicitness for clarity.
import Control.Applicative (empty, many, some, (<*), (*>))
import Control.Exception (try, IOException)
import Data.Set (Set)
import Data.Text (Text)

import qualified Data.Set as Set
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import qualified Text.Megaparsec as MP
import qualified Text.Megaparsec.Char as MPC
import qualified Text.Megaparsec.Char.Lexer as MPCL

type Q = Text
type E = Char

data EntryDFA = EntryDFA
  { state :: Set Q
  , alpha :: Set E
  , delta :: Set (Q,E,Q)
  , start :: Q
  , final :: Set Q
  } deriving Show

-- Path of the S-expression file to parse.
inputFile :: FilePath
inputFile = "foo.sexp"

main :: IO ()
main = do
  -- read file and check for exception instead of checking if
  -- it exists and then trying to read it
  result <- try (TIO.readFile inputFile)
  case result of
    Left e -> print (e :: IOException)
    Right txt -> do
      case MP.parse dfaParser inputFile txt of
        Left e -> print e
        Right dfa -> print dfa

type Parser = MP.Parsec () Text

-- There are no comments in the S-exprs, so leave those empty
spaceConsumer :: Parser ()
spaceConsumer = MPCL.space MPC.space1 empty empty

symbol :: Text -> Parser Text
symbol txt = MPCL.symbol spaceConsumer txt

parens :: Parser a -> Parser a
parens p = MP.between (symbol "(") (symbol ")") p

-- A parenthesised, comma-separated, non-empty list of items, collected into a Set.
setP :: Ord a => Parser a -> Parser (Set a)
setP p = do
  items <- parens (p `MP.sepBy1` (symbol ","))
  return (Set.fromList items)

pair :: Parser a -> Parser b -> Parser (a, b)
pair p1 p2 = parens $ do
  x1 <- p1
  x2 <- symbol "," *> p2
  return (x1, x2)

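-- A state name: a letter followed by zero or more alphanumeric characters.
-- Note that it does not consume trailing whitespace.
stateP :: Parser Text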
stateP = do
  c <- MPC.letterChar
  cs <- many MPC.alphaNumChar
  return (T.pack (c:cs))

dfaParser :: Parser EntryDFA
dfaParser = do
  () <- spaceConsumer
  (_, state) <- pair (symbol "states") (setP stateP)
  (_, alpha) <- pair (symbol "alpha") (setP alphaP)
  (_, delta) <- pair (symbol "trans-func") (setP transFuncP)
  (_, start) <- pair (symbol "start") valP
  (_, final) <- pair (symbol "final") (setP valP)
  return (EntryDFA {state, alpha, delta, start, final})
  where
    alphaP :: Parser Char
    alphaP = MPC.letterChar <* spaceConsumer
    transFuncP :: Parser (Text, Char, Text)
    transFuncP = parens $ do
      s1 <- stateP
      a <- symbol "," *> alphaP
      s2 <- symbol "," *> stateP
      return (s1, a, s2)
    valP :: Parser Text
    valP = fmap T.pack (some MPC.digitChar)