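I want to split a given String into text and placeholders.
For example: "Hello {xyz}, how are you?"
into [Text "Hello ", Placeholder "xyz", Text ", how are you?"]
The code below is what I have written so far, but the problem is that my String comes back one letter at a time.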
data Token = Text String
           | Placeholder String
           deriving (Eq, Show)

tokenize :: String -> [Token]
tokenize [] = []
tokenize [x] = [Text [x]]
tokenize (x:xs) | x /= '{' && x /= '}' = Text [x]:(tokenize xs)
                | otherwise = Placeholder [x]:(tokenize xs)
This is what I get back:
*Main> tokenize "Hello {xyz} fabian"
[Text "H",Text "e",Text "l",Text "l",Text "o",Text " ",Placeholder "{",Text "x",Text "y",Text "z",Placeholder "}",Text " ",Text "f",Text "a",Text "b",Text "i",Text "a",Text "n"]
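Answer 0 (score: 0)
As written, you are processing the text by hand, one character per recursive call, which is why every character ends up in its own token. You could instead use a more powerful tool such as regular expressions. The function splitRegex comes to mind. See the Text.Regex documentation.
Since there is only one kind of placeholder, you can do it like this: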
-- Text.Regex is provided by the regex-compat package.
import Text.Regex

data Token = Text String
           | PlaceHolder String
           deriving (Eq, Show)
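
-- Split the phrase on braces and tag the pieces alternately as text ("TX")
-- and placeholder ("PH"). Assumes braces are balanced and not nested.
makePairList1 :: String -> [(String, String)]
makePairList1 phrase =
    let re1  = mkRegex "[{}]"
        -- Pattern match instead of `head`, so an empty phrase does not crash.
        tags = case phrase of
                 ('}':_) -> error "Phrase starts with closing brace."
                 _       -> [ "TX", "PH" ]
    in  zip (cycle tags) (splitRegex re1 phrase)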

-- Turn a tagged piece into the corresponding Token.
tokenMaker :: (String, String) -> Token
tokenMaker (tag, text) = if tag == "PH" then PlaceHolder text
                                        else Text text

tokenize :: String -> [Token]
tokenize phrase = map tokenMaker (makePairList1 phrase)

main :: IO ()
main = do
    let words = "Hello {who}, how are you {when} ?"
    putStrLn $ "Words are: " ++ words
    let pList1 = makePairList1 words
    putStrLn $ "Pair list = " ++ (show pList1)
    let tokenList = tokenize words
    putStrLn $ "Token list: " ++ (show tokenList)
Note that the intermediate list of string pairs is there mainly for demonstration and debugging.
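For comparison, the same splitting can also be done by hand without a regex library, as long as each step consumes a whole run of characters rather than a single one. The following is only a minimal sketch using `break` from the Prelude. The name `tokenizeRuns` is hypothetical, and treating an unclosed `{` as plain text is an assumption, not something the question specifies.

```haskell
data Token = Text String
           | PlaceHolder String
           deriving (Eq, Show)

-- Hypothetical helper: consume one whole run (text or placeholder) per step.
tokenizeRuns :: String -> [Token]
tokenizeRuns [] = []
tokenizeRuns ('{':rest) =
    case break (== '}') rest of
      -- Found the closing brace: everything before it is the placeholder name.
      (name, '}':after) -> PlaceHolder name : tokenizeRuns after
      -- No closing brace (assumption): keep the remainder as plain text.
      (name, _)         -> [Text ('{' : name)]
tokenizeRuns s =
    -- s is non-empty and does not start with '{', so txt is non-empty.
    let (txt, rest) = break (== '{') s
    in  Text txt : tokenizeRuns rest
```

With this sketch, `tokenizeRuns "Hello {xyz} fabian"` evaluates to `[Text "Hello ",PlaceHolder "xyz",Text " fabian"]`. A stray `}` outside a placeholder is simply kept as part of the surrounding text.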
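If you later move to a more complex syntax that regular expressions can no longer handle, you can switch to a more powerful tool such as Parsec, the Haskell parser combinator library. See the Parsec documentation.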
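To give an idea of what the Parsec route might look like for the same placeholder syntax, here is a small sketch. It assumes the parsec package is available; the names `placeholder`, `plainText`, `tokenList` and `tokenizeP` are made up for illustration. Unlike the regex version, it reports unbalanced braces as a parse error instead of silently mislabelling the pieces.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

data Token = Text String
           | PlaceHolder String
           deriving (Eq, Show)

-- A placeholder is any run of non-brace characters between '{' and '}'.
placeholder :: Parser Token
placeholder = PlaceHolder <$> between (char '{') (char '}') (many (noneOf "{}"))

-- Plain text is a non-empty run of characters that are not braces.
plainText :: Parser Token
plainText = Text <$> many1 (noneOf "{}")

-- The whole input is a sequence of placeholders and text runs.
tokenList :: Parser [Token]
tokenList = many (placeholder <|> plainText) <* eof

-- Hypothetical entry point: Left carries the parse error, Right the tokens.
tokenizeP :: String -> Either ParseError [Token]
tokenizeP = parse tokenList ""
```

For example, `tokenizeP "Hello {xyz}, how are you?"` should give `Right [Text "Hello ",PlaceHolder "xyz",Text ", how are you?"]`, while an input such as `"Hello {xyz"` gives a `Left` value describing where the closing brace was expected.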