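I'm having trouble implementing Read for a tree structure. I want to take a left-associative string (with parens) like ABC(DE)F and convert it into a tree. That particular example corresponds to a specific tree, which is written out as a Haskell value further down.
This is the data type I'm using (though I'm open to suggestions):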
data Tree = Branch Tree Tree | Leaf Char deriving (Eq)
In Haskell, that particular tree would be:
example = Branch (Branch (Branch (Branch (Leaf 'A')
                                         (Leaf 'B'))
                                 (Leaf 'C'))
                         (Branch (Leaf 'D')
                                 (Leaf 'E')))
                 (Leaf 'F')
My show function looks like this:
instance Show Tree where
  show (Branch l r@(Branch _ _)) = show l ++ "(" ++ show r ++ ")"
  show (Branch l r) = show l ++ show r
  show (Leaf x) = [x]
I want to write a read function such that
read "ABC(DE)F" == example
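Answer 0 (score: 13)
In a situation like this, using a parsing library makes the code very short and extremely expressive. (I was surprised at how neat it turned out when I was experimenting to answer this!)
I'm going to use Parsec (that article gives some links for more information), and use it in "applicative mode" (rather than monadic), since we don't need the extra power (and foot-shooting ability) of monads.
First, the various imports and definitions: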
import Text.Parsec
import Control.Applicative ((<*), (<$>))
data Tree = Branch Tree Tree | Leaf Char deriving (Eq, Show)
paren, tree, unit :: Parsec String st Tree
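Now, the basic unit of a tree is either a single character (that isn't a parenthesis) or a parenthesised tree. A parenthesised tree is just a normal tree between ( and ). And a normal tree is just a sequence of units combined into branches left-associatively (the definitions are heavily mutually recursive). In Haskell with Parsec: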
-- parenthesised tree or `Leaf <character>`
unit = paren <|> (Leaf <$> noneOf "()") <?> "group or literal"
-- normal tree between ( and )
paren = between (char '(') (char ')') tree
-- all the units connected up left-associatedly
tree = foldl1 Branch <$> many1 unit
-- attempt to parse the whole input (don't short-circuit on the first error)
onlyTree = tree <* eof
(Yes, that's the entire parser!)
If we wanted to, we could do without paren and unit, but the code above is very expressive, so we can leave it as it is.
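As a brief explanation (I've linked to the documentation): (<|>) basically means "left parser or right parser"; (<?>) lets you produce nicer error messages; noneOf parses anything that is not in the given list of characters; between takes three parsers and returns the value of the third one, provided it is delimited by the first and the second; char parses its argument literally; many1 parses one or more of its argument into a list (the empty string appears to be invalid, hence many1 rather than many, which parses zero or more); eof matches the end of the input.
We can use the parse function to run the parser (it returns Either ParseError Tree; Left is an error and Right is a successful parse).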
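To see the combinators described above in isolation, here is a small sketch that applies them to plain strings of characters rather than to trees. The names atoms, group, item and runAll are hypothetical helpers introduced only for this illustration; they are not part of Parsec or of the answer's parser.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- one or more characters that are not parentheses (many1 + noneOf)
atoms :: Parser String
atoms = many1 (noneOf "()")

-- the same thing, but delimited by '(' and ')' (between + char)
group :: Parser String
group = between (char '(') (char ')') atoms

-- either form, with a friendlier label for error messages ((<|>) + (<?>))
item :: Parser String
item = (group <|> atoms) <?> "group or literal"

-- hypothetical helper: run a parser and insist that it consumes all input (eof)
runAll :: Parser a -> String -> Either ParseError a
runAll p = parse (p <* eof) ""
```

With these definitions, runAll item "(DE)" and runAll item "AB" should both give a Right value ("DE" and "AB" respectively), while runAll item "AB(" and runAll atoms "" should give a Left carrying a ParseError.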
read
Using this as a read-like function might look something like:
read' str = case parse onlyTree "" str of
  Right tr -> tr
  Left er -> error (show er)
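(I use read' to avoid a clash with Prelude.read; if you want a Read instance, you'll have to do a bit more work to implement readPrec (or whatever is required), but it shouldn't be too hard now that the actual parsing is done.)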
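As one possible way to get an actual Read instance, here is a minimal sketch that swaps Parsec for ReadP from base, since readP_to_S produces the ReadS function that readsPrec expects. This is an assumption about how the instance could be written, not the approach taken in this answer; the grammar is the same as above.

```haskell
import Text.ParserCombinators.ReadP
  (ReadP, between, char, many1, readP_to_S, satisfy, (+++))

data Tree = Branch Tree Tree | Leaf Char deriving (Eq, Show)

-- a parenthesised tree, or a single character that is not a parenthesis
unit :: ReadP Tree
unit = between (char '(') (char ')') tree
   +++ (Leaf <$> satisfy (`notElem` "()"))

-- one or more units, combined left-associatively
tree :: ReadP Tree
tree = foldl1 Branch <$> many1 unit

-- readsPrec ignores the precedence argument and returns every parse ReadP finds
instance Read Tree where
  readsPrec _ = readP_to_S tree
```

ReadP is nondeterministic, so readsPrec returns a result for every prefix that forms a valid tree. For an input such as "ABC(DE)F", read still works because exactly one of those results consumes the whole string.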
Some basic examples:
*Tree> read' "A"
Leaf 'A'
*Tree> read' "AB"
Branch (Leaf 'A') (Leaf 'B')
*Tree> read' "ABC"
Branch (Branch (Leaf 'A') (Leaf 'B')) (Leaf 'C')
*Tree> read' "A(BC)"
Branch (Leaf 'A') (Branch (Leaf 'B') (Leaf 'C'))
*Tree> read' "ABC(DE)F" == example
True
*Tree> read' "ABC(DEF)" == example
False
*Tree> read' "ABCDEF" == example
False
Demonstrating errors:
*Tree> read' ""
*** Exception: (line 1, column 1):
unexpected end of input
expecting group or literal
*Tree> read' "A(B"
*** Exception: (line 1, column 4):
unexpected end of input
expecting group or literal or ")"
And finally, the difference between tree and onlyTree:
*Tree> parse tree "" "AB)CD" -- success: ignores ")CD"
Right (Branch (Leaf 'A') (Leaf 'B'))
*Tree> parse onlyTree "" "AB)CD" -- fail: can't parse the ")"
Left (line 1, column 3):
unexpected ')'
expecting group or literal or end of input
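Answer 1 (score: 6)
This looks very much like a stack structure. When you go through the input string "ABC(DE)F", you wrap every atom you find (anything that is not a parenthesis) in a Leaf and put it on an accumulator list. Whenever the list holds 2 items, you Branch them together. This could be done with something like the following (note: untested, only included to give the idea):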
read' :: [Tree] -> String -> (Tree, String)
-- two items on the accumulator: combine them (the older one is the left child)
read' [r,l] str = read' [Branch l r] str
read' acc (c:cs)
  -- read the inner parenthesis
  | c == '(' = let (result, rest) = read' [] cs
               in read' (result : acc) rest
  -- close parenthesis, return result, should be singleton
  | c == ')' = case acc of
      [result] -> (result, cs)
      _        -> error "invalid input"
  -- otherwise, add a leaf
  | otherwise = read' (Leaf c : acc) cs
read' [result] [] = (result, [])
read' _ _ = error "invalid input"
This may need some modification, but I think it's enough to get you on the right track.
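For comparison, here is a minimal sketch of the same accumulator idea that reports failure with Maybe instead of calling error. The names go, attach and parseTree are hypothetical, and the accumulator is a single optional tree rather than a list, which is a simplification assumed here rather than something taken from the answer.

```haskell
data Tree = Branch Tree Tree | Leaf Char deriving (Eq, Show)

-- Walk the input, folding each unit into the tree built so far.
-- The accumulator is Nothing at the start of the input or of a group.
-- Returns the finished tree together with the unconsumed input.
go :: Maybe Tree -> String -> Maybe (Tree, String)
go acc ('(' : cs) = do
  (inner, rest) <- go Nothing cs          -- parse the group's contents
  case rest of
    ')' : rest' -> go (Just (attach acc inner)) rest'
    _           -> Nothing                -- missing ')'
go acc s@(')' : _) = fmap (\t -> (t, s)) acc   -- leave ')' for the caller
go acc (c : cs)    = go (Just (attach acc (Leaf c))) cs
go acc []          = fmap (\t -> (t, [])) acc  -- fails if nothing was parsed

-- Put a new unit to the right of whatever has been built so far.
attach :: Maybe Tree -> Tree -> Tree
attach Nothing  t = t
attach (Just l) r = Branch l r

-- Succeed only if the whole input is one well-formed tree.
parseTree :: String -> Maybe Tree
parseTree s = case go Nothing s of
  Just (t, "") -> Just t
  _            -> Nothing
```

Here parseTree "ABC(DE)F" yields Just of the example tree from the question, while "", "A(B", "()" and "AB)CD" all yield Nothing.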
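Answer 2 (score: 3)
dbaupp's Parsec answer is very easy to understand. As an example of a "low-level" approach, here is a hand-written parser that uses a success continuation to handle the left-associative tree building: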
-- imports needed by the code below
import Control.Monad ((>=>))
import Data.Maybe (maybeToList)

instance Read Tree where readsPrec _prec s = maybeToList (readTree s)

type TreeCont = (Tree,String) -> Maybe (Tree,String)

readTree :: String -> Maybe (Tree,String)
readTree = read'top Just where
  valid ')' = False
  valid '(' = False
  valid _ = True

  read'top :: TreeCont -> String -> Maybe (Tree,String)
  read'top acc s@(x:ys) | valid x =
    case ys of
      [] -> acc (Leaf x,[])
      (y:zs) -> read'branch acc s
  read'top _ _ = Nothing

  -- The next three are mutually recursive
  read'branch :: TreeCont -> String -> Maybe (Tree,String)
  read'branch acc (x:y:zs) | valid x = read'right (combine (Leaf x) >=> acc) y zs
  read'branch _ _ = Nothing

  read'right :: TreeCont -> Char -> String -> Maybe (Tree,String)
  read'right acc y ys | valid y = acc (Leaf y,ys)
  read'right acc '(' ys = read'branch (drop'close >=> acc) ys
    where drop'close (b,')':zs) = Just (b,zs)
          drop'close _ = Nothing
  read'right _ _ _ = Nothing -- assert y==')' here

  combine :: Tree -> TreeCont
  combine build (t, []) = Just (Branch build t,"")
  combine build (t, ys@(')':_)) = Just (Branch build t,ys) -- stop when lookahead shows ')'
  combine build (t, y:zs) = read'right (combine (Branch build t)) y zs