Question

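I'm trying to load a PNG file, get the uncompressed RGBA bytes, and then send them to the gzip or zlib packages.

The pngload package returns image data as a (StorableArray (Int, Int) Word8), and the compression packages take lazy ByteStrings. So I'm trying to build a (StorableArray (Int, Int) Word8 -> ByteString) function.

So far, I've tried the following: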

import qualified Codec.Image.PNG as PNG
import Control.Monad (mapM)
import Data.Array.Storable (withStorableArray)
import qualified Data.ByteString.Lazy as LB (ByteString, pack, take)
import Data.Word (Word8)
import Foreign (Ptr, peekByteOff)

main = do
    -- Load PNG into "image"...
    bytes <- withStorableArray 
        (PNG.imageData image)
        (bytesFromPointer lengthOfImageData)

bytesFromPointer :: Int -> Ptr Word8 -> IO LB.ByteString
bytesFromPointer count pointer = fmap LB.pack $
    mapM (peekByteOff pointer) [0..(count-1)]

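This runs out of stack memory, so clearly I'm doing something wrong. There are more things I could try with Ptr and ForeignPtr, but there are a lot of "unsafe" functions in there.

Any help here would be appreciated; I'm fairly stumped.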

Answer 1

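Generally, pack and unpack are a bad idea for performance. If you have a Ptr and a length in bytes, you can generate a strict ByteString in two different ways: packCStringLen, which copies the bytes (O(n)), or unsafePackCStringLen, which wraps the pointer without copying (O(1)).

Like this: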

import qualified Codec.Image.PNG as PNG
import Control.Monad
import Data.Array.Storable (withStorableArray)

import Codec.Compression.GZip

import qualified Data.ByteString.Lazy   as L
import qualified Data.ByteString.Unsafe as S

import Data.Word
import Foreign

-- Pack a Ptr Word8 as a strict bytestring, then box it to a lazy one, very
-- efficiently
bytesFromPointer :: Int -> Ptr Word8 -> IO L.ByteString
bytesFromPointer n ptr = do
    s <- S.unsafePackCStringLen (castPtr ptr, n)
    return $! L.fromChunks [s]

-- Dummies, since they were not provided 
image = undefined
lengthOfImageData = 10^3

-- Load a PNG, and compress it, writing it back to disk
main = do
    bytes <- withStorableArray
        (PNG.imageData image)
        (bytesFromPointer lengthOfImageData)
    L.writeFile "foo" . compress $ bytes

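I'm using the O(1) version, which just repackages the Ptr from the StorableArray. You may wish to copy it first, via packCStringLen, since the zero-copy ByteString shares the array's memory rather than owning its own copy.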
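
For comparison, here is a minimal sketch of the copying variant. It assumes the byte count can be taken from the array's bounds (one byte per Word8 element) rather than from a separate lengthOfImageData. The names arrayToLazyByteString and exampleArray are hypothetical and are not part of pngload; exampleArray only stands in for PNG.imageData so the sketch is self-contained.

```haskell
import Data.Array.Storable (StorableArray, getBounds, newListArray, withStorableArray)
import Data.Ix (rangeSize)
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import Data.Word (Word8)
import Foreign.Ptr (castPtr)

-- Copy the array's contents into a fresh strict ByteString (O(n)),
-- then wrap that as a single-chunk lazy ByteString.
-- (Hypothetical helper name, not part of any library.)
arrayToLazyByteString :: StorableArray (Int, Int) Word8 -> IO L.ByteString
arrayToLazyByteString arr = do
    bnds <- getBounds arr
    -- Number of Word8 elements, which is also the number of bytes.
    let n = rangeSize bnds
    -- The copy happens while the pointer is still valid, so the
    -- result no longer depends on the array's memory.
    s <- withStorableArray arr $ \ptr ->
        S.packCStringLen (castPtr ptr, n)
    return (L.fromChunks [s])

-- Hypothetical stand-in for PNG.imageData: a 2x3 array of bytes.
exampleArray :: IO (StorableArray (Int, Int) Word8)
exampleArray = newListArray ((0, 0), (1, 2)) [1 .. 6]
```

Because the bytes are copied inside withStorableArray, the resulting ByteString can safely outlive the array, at the cost of one extra pass over the data.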

Answer 2

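The problem with your bytesFromPointer is that you take a packed representation, the StorableArray from pngload, and want to convert it to another packed representation, a ByteString, by going through an intermediate list. Sometimes laziness means the intermediate list is never built in memory, but that is not the case here.

The function mapM is the first offender. If you expand mapM (peekByteOff pointer) [0..(count-1)], you get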

el0 <- peekByteOff pointer 0
el1 <- peekByteOff pointer 1
el2 <- peekByteOff pointer 2
...
eln <- peekByteOff pointer (count-1)
return [el0,el1,el2,...eln]

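Because these actions all occur in the IO monad, they are executed in order. This means that every element of the output list must be computed before the list itself is returned, so laziness never gets a chance to help you.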
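
This can be observed with a small sketch that counts how many actions run when only the first three results are used. The names firstThreeOf and peekLike are hypothetical; peekLike merely stands in for peekByteOff so that no real pointer is needed.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Run an action for every index, but only use the first three results.
-- Returns those three results and the number of actions that actually ran.
firstThreeOf :: Int -> IO ([Int], Int)
firstThreeOf count = do
    counter <- newIORef 0
    let peekLike i = do          -- hypothetical stand-in for peekByteOff
            modifyIORef' counter (+ 1)
            return (i * 2)
    -- mapM in IO runs every action before the list is returned.
    xs <- mapM peekLike [0 .. count - 1]
    ran <- readIORef counter
    return (take 3 xs, ran)
```

Calling firstThreeOf 1000 yields ([0,2,4], 1000): all 1000 actions run even though only three results are consumed.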

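Even if the list were constructed lazily, as Don Stewart points out, the pack function would still ruin your performance. The problem with pack is that it needs to know how many elements are in the list in order to allocate the right amount of memory. To find the length of a list, the program has to traverse it to the end. Because the length has to be computed, the list must be fully loaded before it can be packed into a ByteString.

I consider mapM, along with pack, to be a code smell. Sometimes you can replace mapM with mapM_, but in this case it is better to use the ByteString creation functions, such as packCStringLen.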

How do I convert a (StorableArray (Int, Int) Word8) into a lazy ByteString?
