Question

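I'm trying to get the size in bytes of a big integer. Since arbitrary-precision arithmetic is used under the hood, I'd like to know how much space the result actually takes up.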

Here is a sample:

Prelude> import Data.Bits
Prelude> let fac1000 = product [1..1000] -- big!
Prelude Data.Bits> finiteBitSize fac1000

<interactive>:37:1: error:
    • Ambiguous type variable ‘b0’ arising from a use of ‘finiteBitSize’
      prevents the constraint ‘(FiniteBits b0)’ from being solved.
      Probable fix: use a type annotation to specify what ‘b0’ should be.
      These potential instances exist:
        instance FiniteBits Bool -- Defined in ‘Data.Bits’
        instance FiniteBits Int -- Defined in ‘Data.Bits’
        instance FiniteBits Word -- Defined in ‘Data.Bits’
    • In the expression: finiteBitSize fac1000
      In an equation for ‘it’: it = finiteBitSize fac1000

<interactive>:37:15: error:
    • Ambiguous type variable ‘b0’ arising from a use of ‘fac1000’
      prevents the constraint ‘(Num b0)’ from being solved.
      Probable fix: use a type annotation to specify what ‘b0’ should be.
      These potential instances exist:
        instance Num Integer -- Defined in ‘GHC.Num’
        instance Num Double -- Defined in ‘GHC.Float’
        instance Num Float -- Defined in ‘GHC.Float’
        ...plus two others
        (use -fprint-potential-instances to see them all)
    • In the first argument of ‘finiteBitSize’, namely ‘fac1000’
      In the expression: finiteBitSize fac1000
      In an equation for ‘it’: it = finiteBitSize fac1000

The type annotation it suggests doesn't seem reasonable, for example when I coerce to Int:

Prelude Data.Bits> finiteBitSize (fac1000 :: Int)
64

Well, that's a big number, and I don't believe the 64. In Python, I get:

>>> import sys, math
>>> sys.getsizeof(math.factorial(1000))
1164

That looks a lot more believable to me for the astronomically big 4.02e2567.

Answer 1

You can approximate the number of bytes by asking for its log base 256 using the integer-logarithms package:

Math.NumberTheory.Logarithms> integerLogBase 256 (product [1..1000])
1066

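This is only an approximation, since it accounts only for the bytes used to store the number itself. Arbitrary-precision integers typically also carry some overhead for storing the number's length, and possibly some overallocation, and taking the logarithm captures neither of these.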

If you don't mind reporting the approximate size in bits rather than bytes, integerLog2 will be faster.

Math.NumberTheory.Logarithms> integerLog2 (product [1..1000])
8529
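
If pulling in the extra package is not an option, the same estimates can be sketched with nothing but base. The helper names bitLength and byteLength below are made up for illustration and are not part of any library. They count how many shifts it takes to reach zero, so they report the number of digits (floor of the logarithm plus one) rather than the logarithm itself:

```haskell
import Data.Bits (shiftR)

-- | Number of binary digits of |n| (0 for n == 0).
-- Hypothetical helper, not a library function.
bitLength :: Integer -> Int
bitLength n = length (takeWhile (> 0) (iterate (`shiftR` 1) (abs n)))

-- | Number of base-256 digits of |n|, i.e. the bytes needed for the
-- magnitude alone, ignoring any header or overallocation.
byteLength :: Integer -> Int
byteLength n = length (takeWhile (> 0) (iterate (`shiftR` 8) (abs n)))
```

For product [1..1000] this gives a bitLength of 8530 and a byteLength of 1067, one more than the floor logarithms 8529 and 1066 shown above. Repeated shifting is much slower than integerLog2, so treat it as an illustration rather than a replacement.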

If you want the true answer, you'll have to use some very low-level APIs and depend on the exact definition of Integer:

{-# LANGUAGE MagicHash #-}
import Data.Bits
import GHC.Exts
import GHC.Integer.GMP.Internals
import GHC.Prim

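-- Size in bytes: one word per constructor plus the payload it holds.
sizeOfInteger :: Integer -> Int
sizeOfInteger n = constructorSize + case n of
    S# i -> finiteBitSize (I# i) `div` 8  -- small value: a single machine Int
    Jp# bn -> sizeOfBigNat bn             -- large positive value
    Jn# bn -> sizeOfBigNat bn             -- large negative value
    where
    constructorSize = finiteBitSize (0 :: Word) `div` 8
    sizeOfBigNat (BN# arr) = constructorSize + I# (sizeofByteArray# arr)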

Try it out in ghci:

> sizeOfInteger (product [1..1000])
1088

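Beware! I don't know all the promises these internal APIs make. Computing equal Integers in different ways may produce values that are represented differently. When you touch these internal APIs, you sometimes lose the guarantees of the abstract external API. In this case, x == y might not imply sizeOfInteger x == sizeOfInteger y. Read the documentation carefully if you plan to go this way!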
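
How much this depends on the exact definition of Integer shows up when you change compiler versions. The sketch below assumes GHC 9.0 or later, where Integer is defined in GHC.Num.Integer (ghc-bignum) with the constructors IS, IP and IN, and where the big cases hold an unboxed byte array directly. The name limbBytes is hypothetical, and the function reports only the bytes of limb data, not any constructor or array header words:

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), sizeofByteArray#)
import GHC.Num.Integer (Integer (IS, IP, IN))

-- | Bytes of limb data behind an Integer (hypothetical helper).
-- Assumes the ghc-bignum representation used by GHC >= 9.0.
limbBytes :: Integer -> Int
limbBytes n = case n of
  IS _   -> 0                          -- small value: stored inline as an Int#
  IP arr -> I# (sizeofByteArray# arr)  -- large positive value
  IN arr -> I# (sizeofByteArray# arr)  -- large negative value
```

The result is a whole number of machine words, so for product [1..1000] it cannot be smaller than the 1067 bytes the magnitude needs. The same caveat applies here: this is an internal representation, and it may change again.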

Answer 2

You're misunderstanding what finiteBitSize does. From the docs, emphasis mine:

Return the number of bits in the type of the argument. The actual value of the argument is ignored.

The function finiteBitSize :: FiniteBits b => b -> Int tells you a property of the type b, and uses its argument only to pick which type. Any Int will give you the same answer:

ghci> finiteBitSize (0 :: Int)
64
ghci> finiteBitSize (maxBound :: Int)
64
ghci> finiteBitSize (undefined :: Int)
64

This is because Int is the type of lazy machine integers, which fit in one machine word. Indeed:

ghci> product [1..1000] :: Int
0

That's rather smaller than you expected :-)
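
Both points can be checked in a few lines. This is a minimal sketch using only base. The names widths, wrapped and divisibleByWordModulus are made up for illustration, and the wrap-around behaviour is what GHC does for Int rather than something the language report guarantees:

```haskell
import Data.Bits (finiteBitSize)
import Data.Int (Int8)
import Data.Word (Word16)

-- finiteBitSize looks only at the type, so undefined is never evaluated.
widths :: (Int, Int, Int)
widths =
  ( finiteBitSize (undefined :: Int8)
  , finiteBitSize (undefined :: Word16)
  , finiteBitSize (undefined :: Bool)
  )

-- | Squeeze an Integer into an Int. On GHC this keeps only the low
-- finiteBitSize bits of the value.
wrapped :: Integer -> Int
wrapped = fromInteger

-- | True when n is a multiple of 2^(bits in an Int), i.e. when all of
-- its low bits are zero and it therefore wraps to 0.
divisibleByWordModulus :: Integer -> Bool
divisibleByWordModulus n = n `mod` (2 ^ finiteBitSize (0 :: Int)) == 0
```

Here widths evaluates to (8,16,1). 1000! contains 994 factors of two, far more than the 64 bits of an Int, so divisibleByWordModulus (product [1..1000]) is True and wrapped (product [1..1000]) is 0, which is why the Int product above comes out as 0.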

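If you want to measure the size of product [1..1000] as an unbounded Integer, you'll need a different approach. Daniel Wagner's answer gives two good ones: a mathematical one (how to compute log₂ 1000!) and one based on GHC internals.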

Haskell: getsizeof for a big Integer
