Question

Haskell string literals support a number of escape sequences introduced by \, such as \n, \t, and \NUL.

If I have the string literal:

let s = "Newline: \\n Tab: \\t"

how can I define a function escape :: String -> String that converts the string above into:

"Newline: \n Tab: \t"

and does the same for all the other string literal escape sequences?

I am fine with using Quasi Quoting and Template Haskell, but I don't know how to use them to achieve this. Any pointers?

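Update: I just found the Text.ParserCombinators.ReadP module, which is included in the base library. It can work with the readLitChar :: ReadS Char function from Data.Char, which does what I want, but I don't know how to use the ReadP module. I tried the following and it works: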

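import Data.Char (readLitChar)

escape2 :: String -> String
escape2 [] = []
escape2 xs = case readLitChar xs of
    [] -> []                    -- no valid character here: stop
    [(a, b)] -> a : escape2 b   -- a is the decoded character, b the rest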

But this may not be the right way to use the ReadP module. Can anyone offer some pointers?

Another update: Thanks, everyone. My final function is below. Not bad, I think.

import Text.ParserCombinators.ReadP
import Text.Read.Lex

escape xs 
    | []      <- r = []
    | [(a,_)] <- r = a
    where r = readP_to_S (manyTill lexChar eof) xs

Answer 1

You don't need to do anything. When you enter the string literal

let s = "Newline: \\n Tab: \\t"

you can check that it is what you want:

Prelude> putStrLn s
Newline: \n Tab: \t
Prelude> length s
19

If you just ask ghci for the value of s, you get something else:

Prelude> s
"Newline: \\n Tab: \\t"

Evidently it is doing some escape formatting behind your back, and it also displays the quotes. Calling show or print explicitly gives:

Prelude> show s
"\"Newline: \\\\n Tab: \\\\t\""
Prelude> print s
"Newline: \\n Tab: \\t"

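This is because show is meant for serializing values: when you show a string you don't get the original string back, you get a serialized string that can be parsed into the original. The result of show s is what print s actually displays (print is defined as putStrLn . show). When you evaluate show s at the ghci prompt you get an even stranger answer, because ghci applies its own formatting to the characters that show has already serialized.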
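The relationship between show and read described above can be sketched as follows. The names shown and roundTrips are hypothetical, chosen only for this illustration.

```haskell
-- The example string from the question: it contains literal backslashes,
-- not newline or tab characters.
s :: String
s = "Newline: \\n Tab: \\t"

-- show serializes the string: it adds the surrounding quotes and
-- escapes each backslash.
shown :: String
shown = show s

-- read parses the serialized form back, so the round trip is the identity.
roundTrips :: Bool
roundTrips = read shown == s
```

Since print is putStrLn . show, running putStrLn shown displays the same text as print s.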

tl;dr: always use putStrLn to see the value of a string in ghci.

Edit: I just realized that you may want to convert the literal value

Newline: \n Tab: \t

into the actual control sequences. The simplest way is to wrap it in quotes and use read:

Prelude> let s' = '"' : s ++ "\""
Prelude> read s' :: String
"Newline: \n Tab: \t"
Prelude> putStrLn (read s')
Newline: 
 Tab:
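One caveat with read is that it throws an exception on input that is not a valid literal. A minimal sketch of a total variant, assuming readMaybe from Text.Read is available (base 4.6 or later); the name unescapeViaRead is hypothetical:

```haskell
import Text.Read (readMaybe)

-- Wrap the input in double quotes and parse it as a Haskell string literal.
-- Returns Nothing when the input is not a valid literal body, for example
-- when it contains an unescaped double quote or an unknown escape.
unescapeViaRead :: String -> Maybe String
unescapeViaRead raw = readMaybe ('"' : raw ++ "\"")
```

With this, unescapeViaRead s yields Just the converted string, while malformed input yields Nothing instead of a runtime error.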

Edit 2: an example using readLitChar. This is very close to Chris's answer, except that it uses readLitChar:

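import Data.Char (readLitChar)
import Text.ParserCombinators.ReadP

-- Parse zero or more literal characters, then require end of input.
strParser :: ReadP String
strParser = do
  str <- many (readS_to_P readLitChar)
  eof
  return str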

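Then run it with readP_to_S, which gives you a list of matching parses (there shouldn't be more than one match, but there may be none, so you should check for an empty list).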

> putStrLn . fst . head $ readP_to_S strParser s
Newline:
Tab:    
>
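The fst . head call above crashes when there is no parse. A minimal sketch of the empty-list check, under the assumption that returning a Maybe is acceptable; the name unescape is hypothetical and the parser is the same one written in applicative style:

```haskell
import Data.Char (readLitChar)
import Text.ParserCombinators.ReadP (ReadP, eof, many, readP_to_S, readS_to_P)

-- Same parser as above: zero or more literal characters, then end of input.
strParser :: ReadP String
strParser = many (readS_to_P readLitChar) <* eof

-- Total wrapper: Nothing instead of a crash when there is no parse.
unescape :: String -> Maybe String
unescape input = case readP_to_S strParser input of
  ((result, _) : _) -> Just result
  []                -> Nothing
```

Because eof is required, a string containing an invalid escape produces no parse at all rather than a truncated result.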

Answer 2

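Asking about QQ and TH implies that you want to do this conversion at compile time. For simple String -> something conversions you can use the OverloadedStrings literal facility in GHC.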

Edit 2: using the exposed character lexer in Text.Read.Lex

module UnEscape where

import Data.String(IsString(fromString))
import Text.ParserCombinators.ReadP as P
import Text.Read.Lex as L

newtype UnEscape = UnEscape { unEscape :: String }

instance IsString UnEscape where
  fromString rawString = UnEscape lexed
    where lexer = do s <- P.many L.lexChar
                     eof
                     return s
          lexed = case P.readP_to_S lexer rawString of
                    ((answer,""):_) -> answer
                    _ -> error ("UnEscape could not process "++show rawString)

Edit 1: I now have a better UnEscape instance, which uses GHC's read:

instance IsString UnEscape where
  fromString rawString = UnEscape (read (quote rawString))
    where quote s = '"' : s ++ ['"']

For example:

module UnEscape where

import Data.String(IsString(fromString))

newtype UnEscape = UnEscape { unEscape :: String }

instance IsString UnEscape where
  fromString rawString = UnEscape (transform rawString)
    where transform [] = []
          transform ('\\':x:rest) = replace x : transform rest
          transform (y:rest) = y : transform rest
            -- also covers special case of backslash at end
          replace x = case x of
                        'n' -> '\n'
                        't' -> '\t'
                        unrecognized -> unrecognized

The above has to be a separate module from the one that uses unEscape:

{-# LANGUAGE OverloadedStrings #-}
module Main where

import UnEscape(UnEscape(unEscape))

main = do
  let s = "Newline: \\n Tab: \\t"
      t = unEscape "Newline: \\n Tab: \\t"
  print s
  putStrLn s
  print t
  putStrLn t

This produces

shell prompt$ ghci Main.hs 


GHCi, version 7.0.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package ffi-1.0 ... linking ... done.
[1 of 2] Compiling UnEscape         ( UnEscape.hs, interpreted )
[2 of 2] Compiling Main             ( Main.hs, interpreted )
Ok, modules loaded: Main, UnEscape.
*Main> main
"Newline: \\n Tab: \\t"
Newline: \n Tab: \t
"Newline: \n Tab: \t"
Newline: 
 Tab:

Haskell: How do I turn "\\0" into "\0"?

2 answers: