Data.ByteString和Data.ByteString.Char8之间的区别

时间:2017-11-23 02:19:12

标签: haskell bytestring

我读到Char8只支持ASCII字符,如果你使用其他Unicode字符

将会很危险
{-# LANGUAGE OverloadedStrings #-}

--import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text.IO as TIO
import qualified Data.Text.Encoding as E
import qualified Data.Text as T

name :: T.Text
name = "{ \"name\": \"哈时刻\" }"

nameB :: BC.ByteString
nameB = E.encodeUtf8 name

main :: IO ()
main = do
  BC.writeFile "test.json" nameB
  putStrLn "done"

产生与

相同的结果
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString as B
--import qualified Data.ByteString.Char8 as BC
import qualified Data.Text.IO as TIO
import qualified Data.Text.Encoding as E
import qualified Data.Text as T

name :: T.Text
name = "{ \"name\": \"哈时刻\" }"

nameB :: B.ByteString
nameB = E.encodeUtf8 name

main :: IO ()
main = do
  B.writeFile "test.json" nameB
  putStrLn "done"

那么使用Data.ByteString.Char8Data.ByteString

的区别是什么

1 个答案:

答案 0 :(得分:5)

如果你比较Data.ByteStringData.ByteString.Char8,你会注意到后一个参考Word8中引用Char的一堆函数。

-- Data.ByteString
map :: (Word8 -> Word8) -> ByteString -> ByteString
cons :: Word8 -> ByteString -> ByteString
snoc :: ByteString -> Word8 -> ByteString
head :: ByteString -> Word8
uncons :: ByteString -> Maybe (Word8, ByteString) 
{- and so on... -}


-- Data.ByteString.Char8
map :: (Char -> Char) -> ByteString -> ByteString
cons :: Char -> ByteString -> ByteString
snoc :: ByteString -> Char -> ByteString
head :: ByteString -> Char
uncons :: ByteString -> Maybe (Char, ByteString) 
{- and so on... -}

对于这些功能以及这些功能,Data.ByteString.Char8提供了方便,无需不断地将Word8值转换为Char个值。 writeFile在两个模块中完全相同。

这是查看TextByteStringByteString.Char8中类似函数的不同行为的好方法:

{-# LANGUAGE OverloadedStrings #-}

import Data.Text.Encoding

import qualified Data.Text as T
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

nameText :: T.Text
nameText = "哈时刻"

nameByteString :: B.ByteString
nameByteString = encodeUtf8 nameText

main :: IO ()
main = do
  print $ T.head nameText               -- '\21704'     actual first character
  print $ B.head nameByteString         -- 229          first byte
  print $ BC.head nameByteString        -- '\299'       first byte as character

  putStrLn [ T.head nameText ]          -- 哈           actual first character
  putStrLn [ BC.head nameByteString ]   -- å            first byte as character