我有一个代码,它使用文件句柄来模拟来自源(Bytestring
)的流AWS S3
的接收器。如果我们想使用Network.Websocket
作为接收器,那么将下面的代码中的LBS.writeFile
与sendBinaryData
(带连接句柄)交换就足够了吗?
{-# LANGUAGE OverloadedStrings,ScopedTypeVariables #-}
import qualified Aws
import qualified Aws.S3 as S3
import Data.Conduit (($$+-))
import qualified Data.Conduit.List as CL (mapM_)
import qualified Data.ByteString.Streaming.HTTP as SP
import qualified Data.ByteString.Lazy as LBS
import Streaming as S
import Streaming.Prelude as S hiding (show,print)
import Control.Concurrent.Async (async,waitCatch)
import Data.Text as T (Text)
data AwsConfig a = AwsConfig { _aws_cfg :: Aws.Configuration, _aws_s3cfg :: S3.S3Configuration a, _aws_httpmgr :: SP.Manager }
getObject :: AwsConfig Aws.NormalQuery -> T.Text -> T.Text -> IO Int
getObject cfg bucket key = do
req <- waitCatch =<< async (runResourceT $ do
{- Create a request object with S3.getObject and run the request with pureAws. -}
S3.GetObjectResponse { S3.gorResponse = rsp, S3.gorMetadata = mdata } <-
Aws.pureAws (_aws_cfg cfg) (_aws_s3cfg cfg) (_aws_httpmgr cfg) $
S3.getObject bucket key
{- Stream the response to a lazy bytestring -}
liftIO $ LBS.writeFile "testaws" LBS.empty -- this will be replaced by content-length of the bytes
let obj = (($$+- CL.mapM_ S.yield) . hoist lift ) (SP.responseBody rsp)
S.mapM_ (liftIO . (LBS.appendFile "testaws") . LBS.fromStrict) obj
return $ lookup "content-length" (S3.omUserMetadata mdata))
case req of
Left _ -> return 2 -- perhaps, we could use this to send an error message over websocket
Right _ -> return 0
对我而言,混淆的根源是如何确定流的终止?如果是文件,则由writeFile
API处理。那么sendBinaryData
呢?它是否以与writeFile
类似的方式处理终止?或者它是由客户端的数据解析器决定的?
更新
这个问题是关于如何将数据流式传输到websocket句柄(让我们假设已经提供了句柄),就像我们在上面的示例中对文件句柄一样,而不是真正关于如何管理{{1 }}。 resourceT
似乎确实采用mapM_
方法来接收数据。所以,似乎确实是要走的路。
终止问题是因为我有这样的想法:如果我们有一个函数侦听Websocket句柄另一端的数据,那么确定消息的结尾似乎在流媒体中很重要上下文。给定如下函数:
conduit
如果我们f :: LBS.ByteString -> a
将数据流式传输到websocket句柄,是否需要添加某种S.mapM_
标记,以便另一方的end of stream
监听可以停止处理懒惰的字节串。否则f
将不知道消息何时完成。
答案 0 :(得分:2)
你认为句柄需要额外的技巧是正确的。但是,由于您已经在使用ResourceT
monad转换器,因此delightfully simple to do with allocate
。 allocate
允许您在资源monad中创建一个句柄并注册一个清理操作(在您的情况下只是关闭连接)。
ok <- runResourceT $ do
(releaseKey, handle) <-
allocate (WebSockets.acceptRequest request)
(`WebSockets.sendClose` closeMessage)
WebSockets.sendBinaryData handle data
return ok
where
request = ...
closeMessage = ...
data = ...
ok = ...
使用allocate
,保证句柄在runResourceT
返回ok
时关闭。
getObject
不应该知道如何接受和关闭WS连接;也许它应该将WS连接句柄作为参数然后写入它。如果将其返回类型升级为ResourceT
,那么您可以将调用者收费到getObject
,并负责调用runResourceT
并分配WS句柄等。但希望上面的例子足以让你继续前进。
答案 1 :(得分:1)
(警告 - 未经过测试的代码。)
您的代码重新打开输出文件,并在每次数据包进入时附加到它。显然,更好的解决方案是使用LBS.hPutStr
使用已打开的文件句柄写入文件。 / p>
即代替:
S.mapM_ (liftIO . (LBS.appendFile "testaws") . LBS.fromStrict) obj
你想用:
S.mapM_ (liftIO . (LBS.hPutStr h) . LBS.fromStrict) obj
当然,这会引用句柄h
,它来自何处?
一种解决方案是将其传递给getObject
或在调用getObject
的主体之前创建它,例如:
getObject cfg bucket key = withFile "output" $ \h -> do
req <- ...
...
S.mapM_ (liftIO . (LBS.hPutStr h) . LBS.fromStrict) obj
...
或许你必须在runResourceT里面创建......我不确定。
更新 - 请参阅@ haoformayor的答案,了解如何让ResourceT为您管理文件句柄。
答案 2 :(得分:1)
这里有一些可能使事情更容易理解的点点滴滴。首先,对于第一个小型演示,修改$ans
,我使用getObject
,无论如何都在Streaming.ByteString.writeFile
,以懒惰的字节字符串来绕道而行。
ResourceT
我们可以使用{-# LANGUAGE OverloadedStrings,ScopedTypeVariables #-}
import qualified Aws
import qualified Aws.S3 as S3
import Data.Conduit
import qualified Data.Conduit.List as CL (mapM_)
import qualified Data.ByteString.Streaming.HTTP as HTTP
import qualified Data.ByteString.Streaming as SB
import qualified Data.ByteString.Streaming.Internal as SB
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import Streaming as S
import Streaming.Prelude as S hiding (show,print)
import Control.Concurrent.Async (async,waitCatch)
import Data.Text as T (Text)
import qualified Network.WebSockets as WebSockets
import Control.Monad.Trans.Resource
data AwsConfig a = AwsConfig { _aws_cfg :: Aws.Configuration
, _aws_s3cfg :: S3.S3Configuration a
, _aws_httpmgr :: HTTP.Manager }
getObject :: AwsConfig Aws.NormalQuery -> FilePath -> T.Text -> T.Text -> IO Int
getObject cfg file bucket key = do
req <- waitCatch =<< async (runResourceT $ do
S3.GetObjectResponse { S3.gorResponse = rsp, S3.gorMetadata = mdata } <-
Aws.pureAws (_aws_cfg cfg) (_aws_s3cfg cfg) (_aws_httpmgr cfg) $
S3.getObject bucket key
let bytestream = do
-- lookup "content-length" (S3.omUserMetadata mdata))
SB.chunk B.empty -- this will be replaced by content-length
hoist lift (HTTP.responseBody rsp) $$+- CL.mapM_ SB.chunk
SB.writeFile file bytestream ) -- this is in ResourceT
case req of
Left _ -> return 2
Right _ -> return 0
或
SB.writeFile
现在,我们需要一个不包含在流式字节串库中的小助手
getObjectAbstracted
:: (SB.ByteString (ResourceT IO) () -> ResourceT IO b)
-> AwsConfig Aws.NormalQuery -> S3.Bucket -> Text -> ResourceT IO b
getObjectAbstracted action cfg bucket key = do
S3.GetObjectResponse { S3.gorResponse = rsp, S3.gorMetadata = mdata } <-
Aws.pureAws (_aws_cfg cfg)
(_aws_s3cfg cfg)
(_aws_httpmgr cfg)
(S3.getObject bucket key)
action (hoist lift (HTTP.responseBody rsp) $$+- CL.mapM_ SB.chunk)
并且可以使用streaming bytestring
或多或少地进行@haoformayor计划mapMChunks_ :: Monad m => (B.ByteString -> m ()) -> SB.ByteString m r -> m r
mapMChunks_ act bytestream = do
(a S.:> r) <- SB.foldlChunksM (\_ bs -> act bs) (return ()) bytestream
return r
当然,到目前为止,这一切都没有使用writeConnection :: MonadIO m => WebSockets.Connection -> SB.ByteString m r -> m r
writeConnection connection =
mapMChunks_ (liftIO . WebSockets.sendBinaryData connection)
-- following `haoformayor`
connectWrite
:: (MonadResource m, WebSockets.WebSocketsData a)
=> WebSockets.PendingConnection
-> a -- closing message
-> SB.ByteString m r -- stream from aws
-> m r
connectWrite request closeMessage bytestream = do
(releaseKey, connection) <- allocate (WebSockets.acceptRequest request)
(`WebSockets.sendClose` closeMessage)
writeConnection connection bytestream
getObjectWS :: WebSockets.WebSocketsData a =>
WebSockets.PendingConnection
-> a
-> AwsConfig Aws.NormalQuery
-> S3.Bucket
-> Text
-> ResourceT IO ()
getObjectWS request closeMessage = getObjectAbstracted (connectWrite request closeMessage)
和conduit
/ streaming
之间的区别。