Is there a limit on the length of constructor names? What are the consequences of having an absurdly long constructor name?
data
Answer 0 (score: 28)
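If we look at the source for GHC, we can find the type used to define data constructors. It is named DataCon, and it has the following field:
    dcName :: Name, -- This is the name of the *source data con*
Going down the rabbit hole, Name contains an OccName:
    n_occ :: !OccName, -- Its occurrence name
OccName contains a FastString for the name: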
    data OccName = OccName
        { occNameSpace :: !NameSpace
        , occNameFS :: !FastString
        }
        deriving Typeable
Finally, a FastString is just a ByteString that also carries a precomputed length and an Int that tags it for fast comparison:
    data FastString = FastString {
        uniq :: {-# UNPACK #-} !Int, -- unique id
        n_chars :: {-# UNPACK #-} !Int, -- number of chars
        fs_bs :: {-# UNPACK #-} !ByteString,
        fs_ref :: {-# UNPACK #-} !(IORef (Maybe FastZString))
      } deriving Typeable
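To illustrate why tagging a string with a unique Int makes comparison fast, here is a minimal sketch of string interning. It is a toy model, not GHC's actual implementation: the names `Interned`, `Table`, and `intern` are hypothetical, the table is a plain `Data.Map` rather than whatever GHC uses internally, and `BC.pack` only handles characters up to code point 255.

```haskell
import qualified Data.ByteString.Char8 as BC
import qualified Data.Map.Strict as Map

-- Hypothetical, simplified stand-in for GHC's FastString.
data Interned = Interned
  { internId    :: !Int            -- unique id assigned when first interned
  , internLen   :: !Int            -- precomputed length in bytes
  , internBytes :: !BC.ByteString  -- the actual contents
  } deriving Show

-- Equality compares only the ids, so its cost does not depend on
-- how long the underlying string is.
instance Eq Interned where
  a == b = internId a == internId b

-- Table of every string interned so far, keyed by its contents.
type Table = Map.Map BC.ByteString Interned

-- Return the existing entry for a string, or allocate a fresh id for it.
intern :: Table -> String -> (Table, Interned)
intern table str =
  case Map.lookup key table of
    Just existing -> (table, existing)
    Nothing       -> (Map.insert key fresh table, fresh)
  where
    key   = BC.pack str
    fresh = Interned (Map.size table) (BC.length key) key
```

Interning the same 1000-character name twice yields two values with the same id, so comparing them never touches the 1000 bytes. The expensive lookup happens once, at interning time.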
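There is no limit on the size of the string stored in a FastString (other than maxBound :: Int, obviously). However, this does not rule out bugs elsewhere in the code that could cause problems.
So we need a program to test it: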
    {-# LANGUAGE BangPatterns #-}
    {-# LANGUAGE TemplateHaskell #-}

    module Main where

    import Control.Applicative ((<$>))
    import Control.Monad (forM_)
    import System.IO (hPutStr, hFileSize, hClose)
    import System.Exit (ExitCode(..))
    import System.IO.Temp (withSystemTempFile)
    import Data.Time.Clock.POSIX (getPOSIXTime)
    import System.Process (readProcessWithExitCode)

    -- timing functions (from criterion)
    getTime :: IO Double
    getTime = (fromRational . toRational) `fmap` getPOSIXTime

    time :: IO a -> IO (Double, a)
    time act = do
        start <- getTime
        result <- act
        end <- getTime
        let !delta = end - start
        return (delta, result)

    -- make a constructor like
    -- data C = FFFFFF
    makeConstructor :: Int -> String
    makeConstructor size = "data C = " ++ replicate size 'F'

    wrapWithMainModule :: String -> String
    wrapWithMainModule code = unlines ["module Main where", "main = return ()", code]

    data CompileResults = CompileResults {
        timeTaken :: Double,
        success :: Bool,
        outputFileSize :: Integer
      } deriving (Show)

    compileHsCode :: String -> IO CompileResults
    compileHsCode sourceCode = withSystemTempFile "test.hs" $ \path handle -> do
        withSystemTempFile "output.o" $ \outputPath outputHandle -> do
            hPutStr handle $ wrapWithMainModule sourceCode
            hClose handle
            (timeTaken, (exitCode, _, _)) <- time $ readProcessWithExitCode "ghc" ["-c", "-o", outputPath, path] ""
            let success = exitCode == ExitSuccess
            size <- if success then hFileSize outputHandle else return 0
            return $ CompileResults {
                timeTaken = timeTaken
              , success = success
              , outputFileSize = size
              }

    testConstructorSizes :: [Int] -> IO ()
    testConstructorSizes sizes = forM_ sizes $ \size -> do
        info <- compileHsCode $ makeConstructor size
        putStrLn $ "For Size " ++ show size ++ "\t: " ++ show info

    -- Up to 10 million
    sizesToTest :: [Int]
    sizesToTest = take 7 (iterate (*10) 10)

    main = testConstructorSizes $ sizesToTest
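To make concrete what the test program feeds to ghc, the sketch below repeats the two string-building functions from the program and composes them. The helper name `generatedSource` is hypothetical and not part of the original program.

```haskell
-- Same definitions as in the test program above.
makeConstructor :: Int -> String
makeConstructor size = "data C = " ++ replicate size 'F'

wrapWithMainModule :: String -> String
wrapWithMainModule code = unlines ["module Main where", "main = return ()", code]

-- Hypothetical helper: the complete source text compiled for a given
-- constructor name length.
generatedSource :: Int -> String
generatedSource = wrapWithMainModule . makeConstructor
```

Evaluating `putStr (generatedSource 6)` prints a three-line module:

```
module Main where
main = return ()
data C = FFFFFF
```

Only the last line grows with the requested size, so any change in compile time or object file size between runs comes from the constructor name alone.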
Here are the results of running main:
    For Size 10 : CompileResults {timeTaken = 0.1390078067779541, success = True, outputFileSize = 1818}
    For Size 100 : CompileResults {timeTaken = 0.14700841903686523, success = True, outputFileSize = 2086}
    For Size 1000 : CompileResults {timeTaken = 0.1390080451965332, success = True, outputFileSize = 4786}
    For Size 10000 : CompileResults {timeTaken = 0.1520085334777832, success = True, outputFileSize = 31786}
    For Size 100000 : CompileResults {timeTaken = 0.31201791763305664, success = True, outputFileSize = 301786}
    For Size 1000000 : CompileResults {timeTaken = 2.26712965965271, success = True, outputFileSize = 3001786}
    For Size 10000000 : CompileResults {timeTaken = 109.2182469367981, success = True, outputFileSize = 30001786}
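A few interesting points:
The output file size is 1786 + (constructorSize * 3) bytes for every run from size 100 upward (the size-10 run comes in at 1818, two bytes above what the formula gives). So each char takes up three bytes when used in a constructor.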
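The three-bytes-per-char formula can be checked against the measurements with a few lines of arithmetic. This is a small sketch: the names `measured`, `predictedSize`, and `residuals` are made up for illustration, and the numbers are copied from the results listed above.

```haskell
-- (constructor name length, object file size in bytes), copied from the
-- results above.
measured :: [(Integer, Integer)]
measured =
  [ (10, 1818), (100, 2086), (1000, 4786), (10000, 31786)
  , (100000, 301786), (1000000, 3001786), (10000000, 30001786)
  ]

-- The formula under test: a fixed 1786 bytes plus three bytes per character.
predictedSize :: Integer -> Integer
predictedSize n = 1786 + 3 * n

-- Measured size minus predicted size for each run.
residuals :: [(Integer, Integer)]
residuals = [ (n, actual - predictedSize n) | (n, actual) <- measured ]
```

Evaluating `residuals` gives `[(10,2),(100,0),(1000,0),(10000,0),(100000,0),(1000000,0),(10000000,0)]`. The formula is exact from 100 characters upward, and only the shortest name is off, by two bytes. The measurements do not say why. Padding in the object file is one possible explanation, but that is a guess.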