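Using sourceFile, we get a stream of ByteString chunks.
Following my other question, "Combining multiple Sources/Producers into one", I can use ZipSource, sourceFile, and a custom source that produces an infinite stream of StdGen to obtain a source of (StdGen, ByteString).
What I want is to pair each StdGen with a single byte of the ByteString, but with my current implementation I get one StdGen paired with the entire contents of the input file read by sourceFile.
I have looked at the isolate function from Data.Conduit.Binary, but it does not seem to work for me when I use it: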
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE OverloadedStrings #-}

import System.Random (StdGen(..), split, newStdGen, randomR)
import ClassyPrelude.Conduit as Prelude
import Control.Monad.Trans.Resource (runResourceT, ResourceT(..))
import qualified Data.ByteString as BS
import Data.Conduit.Binary (isolate)

-- generate an infinite source of random number seeds
sourceStdGen :: MonadIO m => Source m StdGen
sourceStdGen = do
    g <- liftIO newStdGen
    loop g
  where
    loop gin = do
        let g' = fst (split gin)
        yield gin
        loop g'

-- combine the sources into one
sourceInput :: (MonadResource m, MonadIO m) => FilePath -> Source m (StdGen, ByteString)
sourceInput fp = getZipSource $ (,)
    <$> ZipSource sourceStdGen
    <*> ZipSource (sourceFile fp $= isolate 1)

-- a simple conduit that generates a random number from the provided StdGen
-- and appends it as a byte to the provided ByteString
simpleConduit :: Conduit (StdGen, ByteString) (ResourceT IO) ByteString
simpleConduit = mapC process

process :: (StdGen, ByteString) -> ByteString
process (g, bs) =
    let rnd = fst $ randomR (40,50) g
    in bs ++ pack [rnd]

main :: IO ()
main = do
    runResourceT $ sourceInput "test.txt" $$ simpleConduit =$ sinkFile "output.txt"

In Conduit terms, I assume that isolate will await a chunk, yield the head of the incoming ByteString stream, and leftover the rest (putting it back on the queue of the incoming stream). Basically, what I am trying to do is chop the incoming ByteString stream into single-byte blocks.
Am I using it correctly? If isolate is not the function I should be using, can anyone suggest another function that splits the stream into chunks of an arbitrary number of bytes?
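
For reference, here is a small sketch of how isolate behaves on an in-memory stream. As far as the conduit-extra documentation describes it, isolate n lets at most n bytes through in total and then stops, rather than re-chunking the whole stream. The names chunks and demo are made up for this illustration, and the list of chunks only stands in for what sourceFile would produce.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString (ByteString)
import Data.Conduit (($$), ($=))
import Data.Conduit.Binary (isolate)
import qualified Data.Conduit.List as CL

-- Hypothetical in-memory input standing in for the chunks sourceFile yields.
chunks :: [ByteString]
chunks = ["abc", "def"]

-- Run the chunks through isolate 1 and collect everything that comes out.
demo :: IO [ByteString]
demo = CL.sourceList chunks $= isolate 1 $$ CL.consume

main :: IO ()
main = demo >>= print  -- prints ["a"]
```

Only the first byte reaches the sink, which would explain why zipping with sourceFile fp $= isolate 1 does not produce one pair per byte.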
Answer 0 (score: 2)
If I understand correctly, you need something like this:
import System.Random (StdGen, split, newStdGen, randomR)
import qualified Data.ByteString as BS
import Data.Conduit
import Data.ByteString (ByteString, pack, unpack, singleton)
import Control.Monad.Trans (MonadIO (..))
import Data.List (unfoldr)
import qualified Data.Conduit.List as L
import Data.Monoid ((<>))

-- pair every byte of the file (as a one-byte ByteString) with its own generator
input :: MonadIO m => FilePath -> Source m (StdGen, ByteString)
input path = do
    gs <- unfoldr (Just . split) `fmap` liftIO newStdGen
    bs <- (map singleton . unpack) `fmap` liftIO (BS.readFile path)
    mapM_ yield (zip gs bs)

-- append a random lowercase letter to every byte and concatenate the results
output :: Monad m => Sink (StdGen, ByteString) m ByteString
output = L.foldMap (\(g, bs) -> let rnd = fst $ randomR (97,122) g in bs <> pack [rnd])

main :: IO ()
main = (input "in.txt" $$ output) >>= BS.writeFile "out.txt"
Omitting map singleton is probably more efficient; you can also work with Word8 directly and convert back to a ByteString at the end.
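
A rough sketch of that Word8 variant, under the same assumptions as the code above (the whole file is read with BS.readFile). The helper name pairBytes and the in-memory demo in main are made up for illustration; the bytes are collected as a list and packed once at the end.

```haskell
import Control.Monad.IO.Class (MonadIO (..))
import qualified Data.ByteString as BS
import Data.Conduit
import qualified Data.Conduit.List as L
import Data.List (unfoldr)
import Data.Word (Word8)
import System.Random (StdGen, newStdGen, randomR, split)

-- Pair every byte with its own generator (hypothetical helper).
pairBytes :: StdGen -> BS.ByteString -> [(StdGen, Word8)]
pairBytes g0 = zip (unfoldr (Just . split) g0) . BS.unpack

-- File-backed source, yielding Word8 instead of one-byte ByteStrings.
input :: MonadIO m => FilePath -> Source m (StdGen, Word8)
input path = do
  g0 <- liftIO newStdGen
  bs <- liftIO (BS.readFile path)
  mapM_ yield (pairBytes g0 bs)

-- Emit each input byte followed by a random lowercase letter, pack once.
output :: Monad m => Sink (StdGen, Word8) m BS.ByteString
output = BS.pack `fmap` (L.concatMap step =$ L.consume)
  where
    step (g, w) = [w, fst (randomR (97, 122) g)]

main :: IO ()
main = do
  g0 <- newStdGen
  -- In-memory demo; with a file: (input "in.txt" $$ output) >>= BS.writeFile "out.txt"
  out <- L.sourceList (pairBytes g0 (BS.pack [72, 105])) $$ output
  print (BS.length out)  -- prints 4: each input byte is followed by one random byte
```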
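Answer 1 (score: 0)
I managed to write a conduit myself (condWord) that splits the incoming ByteString into Word8 pieces. I am not sure whether I am reinventing the wheel here.
To get the intended behavior, I just fuse condWord onto sourceFile.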
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE OverloadedStrings #-}

import System.Random (StdGen(..), split, newStdGen, randomR)
import ClassyPrelude.Conduit as Prelude
import Control.Monad.Trans.Resource (runResourceT, ResourceT(..))
import qualified Data.ByteString as BS
import Data.Conduit.Binary (isolate)

-- generate an infinite source of random number seeds
sourceStdGen :: MonadIO m => Source m StdGen
sourceStdGen = do
    g <- liftIO newStdGen
    loop g
  where
    loop gin = do
        let g' = fst (split gin)
        yield gin
        loop g'

-- combine the sources into one
sourceInput :: (MonadResource m, MonadIO m) => FilePath -> Source m (StdGen, Word8)
sourceInput fp = getZipSource $ (,)
    <$> ZipSource sourceStdGen
    <*> ZipSource (sourceFile fp $= condWord)

-- a simple conduit that generates a random number from the provided StdGen
-- and emits the input byte followed by that random byte
simpleConduit :: Conduit (StdGen, Word8) (ResourceT IO) ByteString
simpleConduit = mapC process

process :: (StdGen, Word8) -> ByteString
process (g, ch) =
    let rnd = fst $ randomR (97,122) g
    in pack [fromIntegral ch, rnd]

-- split the incoming ByteString chunks into individual bytes
condWord :: (Monad m) => Conduit ByteString m Word8
condWord = do
    mbs <- await
    case mbs of
        Nothing -> return ()
        Just bs ->
            case BS.uncons bs of
                -- skip empty chunks (including the empty leftover pushed back
                -- after the last byte of a chunk) instead of ending the stream
                Nothing -> condWord
                Just (h, t) -> do
                    yield h
                    leftover t
                    condWord

main :: IO ()
main = do
    runResourceT $ sourceInput "test.txt" $$ simpleConduit =$ sinkFile "output.txt"
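
On the reinventing-the-wheel question: the same splitting as condWord can probably be had from Data.Conduit.List.concatMap combined with Data.ByteString.unpack. The sketch below uses plain Data.Conduit rather than ClassyPrelude, and the name bytesC and the sample chunks are made up for illustration.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import Data.Conduit
import qualified Data.Conduit.List as CL
import Data.Word (Word8)

-- Unpack every incoming chunk and yield its bytes one at a time.
bytesC :: Monad m => Conduit ByteString m Word8
bytesC = CL.concatMap BS.unpack

main :: IO ()
main = do
  -- Three chunks, one of them empty, standing in for what sourceFile yields.
  ws <- CL.sourceList [BS.pack [104, 105], BS.empty, BS.pack [33]] $= bytesC $$ CL.consume
  print ws  -- prints [104,105,33]
```

Empty chunks simply contribute no bytes here, so no special case is needed for them.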