Getting split-up strings instead of joined strings, and I don't know why

Time: 2019-05-26 20:01:46

Tags: haskell recursion token

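I want to split a given String into text and placeholders.

For example, "Hello {xyz}, how are you?" should become [Text "Hello ", Placeholder "xyz", Text ", how are you?"]

The code below is what I have written so far, but the problem is that my String comes back one letter at a time.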

data Token = Text String
           | Placeholder String
             deriving (Eq, Show)

tokenize :: String -> [Token]
tokenize [] = []
tokenize [x] = [Text [x]]
tokenize (x:xs) | x /= '{'  && x /= '}' = Text [x]:(tokenize xs)
                | otherwise             = Placeholder [x]:(tokenize xs)

This is what I get back:

*Main> tokenize "Hello {xyz} fabian"
[Text "H",Text "e",Text "l",Text "l",Text "o",Text " ",Placeholder "{",Text "x",Text "y",Text "z",Placeholder "}",Text " ",Text "f",Text "a",Text "b",Text "i",Text "a",Text "n"]

1 answer:

Answer 0 (score: 0)

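As it stands, you are processing the text by hand. You could also use a more powerful tool, such as regular expressions. The function splitRegex comes to mind. See the Text.Regex description.

Since there is only one kind of placeholder, you can do it like this: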

import  Text.Regex

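data Token = Text String
           | Placeholder String
             deriving (Eq, Show)

-- Split the phrase on either brace and tag the pieces alternately:
-- "TX" for plain text, "PH" for a placeholder name.
makePairList1 :: String -> [(String, String)]
makePairList1 phrase = 
    let  re1  = mkRegex "[{}]"
         -- Pattern match instead of head, so an empty phrase does not crash.
         tags = case phrase of
                  ('}':_) -> error "Phrase starts with closing brace."
                  _       -> [ "TX", "PH" ]
    in   zip (cycle tags)  (splitRegex re1 phrase)

-- Turn a tagged piece into the matching Token constructor.
tokenMaker :: (String, String) -> Token
tokenMaker (tag, text) = if (tag=="PH") then Placeholder text
                                        else Text text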

tokenize :: String -> [Token]
tokenize phrase = map tokenMaker (makePairList1 phrase)


main = do
    let  words = "Hello {who}, how are you {when} ?"
    putStrLn $ "Words are: " ++ words
    let  pList1 = makePairList1 words
    putStrLn $ "Pair list = " ++ (show pList1)
    let  tokenList = tokenize words
    putStrLn $ "Token list: " ++ (show tokenList)

Note that the intermediate list of string pairs is there mainly for demonstration and debugging.
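For comparison, the hand-written recursion from the question can also be repaired without any regex library. The question's tokenize emits one token per character because each recursive step wraps only the single character x. The following is a minimal sketch, not the answer's method: it uses break from the Prelude to collect whole runs of characters. How malformed input is treated (an unclosed '{' is kept as plain text, a stray '}' stays inside the text) is an assumption made for this example.

```haskell
data Token = Text String
           | Placeholder String
             deriving (Eq, Show)

-- Collect whole runs of characters instead of one token per character.
tokenize :: String -> [Token]
tokenize [] = []
tokenize ('{':rest) =
    case break (== '}') rest of
      -- Closing brace found: everything before it is the placeholder name.
      (name, '}':rest') -> Placeholder name : tokenize rest'
      -- Assumption: an unclosed brace is returned as plain text.
      (name, _)         -> [Text ('{' : name)]
tokenize str =
    -- str is non-empty and does not start with '{', so txt is non-empty.
    let (txt, rest) = break (== '{') str
    in  Text txt : tokenize rest
```

With this definition, tokenize "Hello {xyz}, how are you?" evaluates to [Text "Hello ",Placeholder "xyz",Text ", how are you?"].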

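If you later move to a more complex syntax that regular expressions can no longer handle, you can switch to a more powerful tool, such as the Haskell parser combinator library Parsec. See the description of Parsec.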
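As a rough idea of what that could look like, here is a small sketch of the same tokenizer written with Parsec. The names placeholder, plainText, tokenList and tokenizeP are made up for this example, and the choice to reject unbalanced braces with a parse error is an assumption, not something the question specifies.

```haskell
import Text.Parsec (ParseError, parse, many, many1, between, char, noneOf, eof, (<|>))
import Text.Parsec.String (Parser)

data Token = Text String
           | Placeholder String
             deriving (Eq, Show)

-- A placeholder is a brace-delimited name such as {xyz}.
placeholder :: Parser Token
placeholder = Placeholder <$> between (char '{') (char '}') (many (noneOf "{}"))

-- Plain text is a non-empty run of characters containing no braces.
plainText :: Parser Token
plainText = Text <$> many1 (noneOf "{}")

-- The whole input is a sequence of tokens; eof makes a stray brace an error.
tokenList :: Parser [Token]
tokenList = many (placeholder <|> plainText) <* eof

-- Hypothetical entry point: Left carries the parse error, Right the tokens.
tokenizeP :: String -> Either ParseError [Token]
tokenizeP = parse tokenList ""
```

Unlike the regex version, this reports malformed input such as "Hello {xyz" as a Left value instead of silently mislabelling the pieces.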