Split a string by a string

Time: 2014-06-03 02:07:03

Tags: string parsing haskell

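I'm trying to build a lexer in Haskell, and I need to write a function f :: String -> String -> (String, String) that splits its input into a tuple of strings. The function needs to split on a known String, and that known String is always at the start of the String being parsed.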

For example, if the String to split on is "expression" and the input is "expression rest of String", I need to produce the output ("expression", "rest of String").

I'm trying to avoid using high-level libraries.

Any help?

Thanks.

2 answers:

Answer 0: (score: 3)

Here you go:

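import Data.List (stripPrefix)

-- Split src on every occurrence of sep, keeping the separators in the result.
f :: String -> String -> [String]
f sep src = f_ src []
  where
    -- acc holds the characters of the current piece, in reverse order.
    f_ [] acc = [reverse acc]
    f_ (x:xs) acc =
      case stripPrefix sep (x:xs) of
        Nothing   -> f_ xs (x:acc)
        Just rest -> [reverse acc] ++ [sep] ++ f_ rest []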
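
Note that the function above returns a list of pieces rather than the (String, String) tuple the question asks for. The following is a minimal sketch of one way to bridge the two; the names splitAll and prefixAndRest are hypothetical, and splitAll simply reproduces the answer's function so that the block is self-contained. It assumes a non-empty separator, since an empty one makes the recursion loop forever on non-empty input.

```haskell
import Data.List (stripPrefix)

-- The answer's function under a hypothetical name: split src on every
-- occurrence of sep, keeping the separators in the result list.
splitAll :: String -> String -> [String]
splitAll sep src = go src []
  where
    go [] acc = [reverse acc]
    go (x:xs) acc =
      case stripPrefix sep (x:xs) of
        Nothing   -> go xs (x:acc)
        Just rest -> [reverse acc] ++ [sep] ++ go rest []

-- Hypothetical adapter: when src starts with sep, the list begins with an
-- empty piece followed by sep; gluing the remaining pieces back together
-- gives the rest of the input.
prefixAndRest :: String -> String -> Maybe (String, String)
prefixAndRest sep src =
  case splitAll sep src of
    ("" : _ : rest) -> Just (sep, concat rest)
    _               -> Nothing
```

With the question's input, splitAll "expression" "expression rest of String" gives ["","expression"," rest of String"], and prefixAndRest gives Just ("expression"," rest of String"); the leading space of the remainder is kept.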

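Answer 1: (score: 1)

You also need to handle failure somehow, for example by returning a Maybe. It also seems you want to drop whitespace at the start of the remainder?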

import Data.Char (isSpace)
import Data.List (stripPrefix)

f :: String -> String -> Maybe String
f prefix = fmap (dropWhile isSpace) . stripPrefix prefix

g :: String -> String -> (String, Maybe String)
g prefix str = (prefix, f prefix str)

h :: String -> String -> Maybe (String, String)
h prefix str = case f prefix str of 
  Nothing -> Nothing
  Just rest -> Just (prefix, rest)


-- >>> f "expression" "expression rest of string"
-- Just "rest of string"
-- >>> g "expression" "expression rest of string"
-- ("expression",Just "rest of string")
-- >>> h "expression" "expression rest of string"
-- Just ("expression","rest of string")

-- >>> f "expression" "xpression rest of string"
-- Nothing
-- >>> g "expression" "xpression rest of string"
-- ("expression",Nothing)
-- >>> h "expression" "xpression rest of string"
-- Nothing
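
If even Data.List and Data.Char count as libraries to avoid, the same idea can be written with plain recursion. This is only a sketch under that assumption: dropPrefix and splitPrefix are hypothetical names, and unlike isSpace the version below skips only the space character ' ', not tabs or newlines.

```haskell
-- Hand-written counterpart of Data.List.stripPrefix (hypothetical name):
-- returns the remainder if the first argument is a prefix of the second.
dropPrefix :: String -> String -> Maybe String
dropPrefix [] rest = Just rest          -- whole prefix consumed
dropPrefix _ [] = Nothing               -- input ran out before the prefix did
dropPrefix (p:ps) (c:cs)
  | p == c    = dropPrefix ps cs
  | otherwise = Nothing

-- Split a known prefix off the input and skip leading spaces in the rest.
splitPrefix :: String -> String -> Maybe (String, String)
splitPrefix prefix str =
  case dropPrefix prefix str of
    Nothing   -> Nothing
    Just rest -> Just (prefix, dropWhile (== ' ') rest)
```

Here splitPrefix "expression" "expression rest of string" evaluates to Just ("expression","rest of string"), and a mismatching input such as "xpression rest of string" gives Nothing.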

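This is a bit like the string parser found in most parser libraries, whose type is something like String -> Parser String: it returns the string if it finds it, and fails otherwise. The simplest such library ships with GHC and isn't a 'high-level library':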

 >>> import Text.ParserCombinators.ReadP
 >>> let apply = readP_to_S 
 >>> apply (string "expression") "expression rest of string"
 [("expression"," rest of string")]
 >>> apply (string "expression") "xpression rest of string"
 []
 >>> let myParser = string "expression" >> skipSpaces
 >>> apply myParser "expression rest of string"
 [((),"rest of string")]
 >>> apply myParser "xpression rest of string"
 []
 >>> let myBetterParser = do {exp <- string "expression" ; skipSpaces ; return exp}
 >>> apply myBetterParser "expression rest of string"
 [("expression","rest of string")]
 >>> apply myBetterParser "xpression rest of string"
 []