将字节流式传输到网络websocket

时间:2016-06-14 13:13:47

标签: haskell haskell-pipes

我有一个代码,它使用文件句柄来模拟来自源(Bytestring)的流AWS S3的接收器。如果我们想使用Network.Websocket作为接收器,那么将下面的代码中的LBS.writeFilesendBinaryData(带连接句柄)交换就足够了吗?

{-# LANGUAGE OverloadedStrings,ScopedTypeVariables #-}

import qualified Aws
import qualified Aws.S3 as S3
import           Data.Conduit (($$+-))
import qualified Data.Conduit.List as CL (mapM_)
import qualified Data.ByteString.Streaming.HTTP as SP
import qualified Data.ByteString.Lazy as LBS
import Streaming as S
import Streaming.Prelude as S hiding (show,print)
import Control.Concurrent.Async (async,waitCatch)
import Data.Text as T (Text)

data AwsConfig a = AwsConfig { _aws_cfg :: Aws.Configuration, _aws_s3cfg :: S3.S3Configuration a, _aws_httpmgr :: SP.Manager }

getObject :: AwsConfig Aws.NormalQuery -> T.Text -> T.Text ->  IO Int
getObject cfg bucket key = do
  req <- waitCatch =<< async (runResourceT $ do
    {- Create a request object with S3.getObject and run the request with pureAws. -}
    S3.GetObjectResponse { S3.gorResponse = rsp, S3.gorMetadata = mdata } <- 
      Aws.pureAws (_aws_cfg cfg) (_aws_s3cfg cfg) (_aws_httpmgr cfg) $
        S3.getObject bucket key
    {- Stream the response to a lazy bytestring -}
    liftIO $ LBS.writeFile "testaws" LBS.empty -- this will be replaced by content-length of the bytes 
    let obj = (($$+- CL.mapM_ S.yield) . hoist lift ) (SP.responseBody rsp)
    S.mapM_ (liftIO . (LBS.appendFile "testaws") . LBS.fromStrict) obj
    return $ lookup "content-length" (S3.omUserMetadata mdata))
  case req of
    Left _ -> return 2 -- perhaps, we could use this to send an error message over websocket
    Right _ -> return 0

对我而言,混淆的根源是如何确定流的终止?如果是文件,则由writeFile API处理。那么sendBinaryData呢?它是否以与writeFile类似的方式处理终止?或者它是由客户端的数据解析器决定的?

更新

这个问题是关于如何将数据流式传输到websocket句柄(让我们假设已经提供了句柄),就像我们在上面的示例中对文件句柄一样,而不是真正关于如何管理{{1 }}。 resourceT似乎确实采用mapM_方法来接收数据。所以,似乎确实是要走的路。

终止问题是因为我有这样的想法:如果我们有一个函数侦听Websocket句柄另一端的数据,那么确定消息的结尾似乎在流媒体中很重要上下文。给定如下函数:

conduit

如果我们f :: LBS.ByteString -> a 将数据流式传输到websocket句柄,是否需要添加某种S.mapM_标记,以便另一方的end of stream监听可以停止处理懒惰的字节串。否则f将不知道消息何时完成。

3 个答案:

答案 0 :(得分:2)

你认为句柄需要额外的技巧是正确的。但是,由于您已经在使用ResourceT monad转换器,因此delightfully simple to do with allocateallocate允许您在资源monad中创建一个句柄并注册一个清理操作(在您的情况下只是关闭连接)。

ok <- runResourceT $ do
  (releaseKey, handle) <-
    allocate (WebSockets.acceptRequest request) 
             (`WebSockets.sendClose` closeMessage)
  WebSockets.sendBinaryData handle data
  return ok
where
  request = ...
  closeMessage = ...
  data = ...
  ok = ...

使用allocate,保证句柄在runResourceT返回ok时关闭。

但是,我并不完全确定这是你想要的。在我看来,getObject不应该知道如何接受和关闭WS连接;也许它应该将WS连接句柄作为参数然后写入它。如果将其返回类型升级为ResourceT,那么您可以将调用者收费到getObject,并负责调用runResourceT并分配WS句柄等。但希望上面的例子足以让你继续前进。

答案 1 :(得分:1)

(警告 - 未经过测试的代码。)

您的代码重新打开输出文件,并在每次数据包进入时附加到它。显然,更好的解决方案是使用LBS.hPutStr使用已打开的文件句柄写入文件。 / p>

即代替:

S.mapM_ (liftIO . (LBS.appendFile "testaws") . LBS.fromStrict) obj

你想用:

S.mapM_ (liftIO . (LBS.hPutStr h) . LBS.fromStrict) obj

当然,这会引用句柄h,它来自何处?

一种解决方案是将其传递给getObject或在调用getObject的主体之前创建它,例如:

getObject cfg bucket key = withFile "output" $ \h -> do
    req <- ...
    ...
    S.mapM_ (liftIO . (LBS.hPutStr h) . LBS.fromStrict) obj
    ...

或许你必须在runResourceT里面创建......我不确定。

更新 - 请参阅@ haoformayor的答案,了解如何让ResourceT为您管理文件句柄。

答案 2 :(得分:1)

这里有一些可能使事情更容易理解的点点滴滴。首先,对于第一个小型演示,修改$ans,我使用getObject,无论如何都在Streaming.ByteString.writeFile,以懒惰的字节字符串来绕道而行。

ResourceT

我们可以使用{-# LANGUAGE OverloadedStrings,ScopedTypeVariables #-} import qualified Aws import qualified Aws.S3 as S3 import Data.Conduit import qualified Data.Conduit.List as CL (mapM_) import qualified Data.ByteString.Streaming.HTTP as HTTP import qualified Data.ByteString.Streaming as SB import qualified Data.ByteString.Streaming.Internal as SB import qualified Data.ByteString as B import qualified Data.ByteString.Lazy as BL import Streaming as S import Streaming.Prelude as S hiding (show,print) import Control.Concurrent.Async (async,waitCatch) import Data.Text as T (Text) import qualified Network.WebSockets as WebSockets import Control.Monad.Trans.Resource data AwsConfig a = AwsConfig { _aws_cfg :: Aws.Configuration , _aws_s3cfg :: S3.S3Configuration a , _aws_httpmgr :: HTTP.Manager } getObject :: AwsConfig Aws.NormalQuery -> FilePath -> T.Text -> T.Text -> IO Int getObject cfg file bucket key = do req <- waitCatch =<< async (runResourceT $ do S3.GetObjectResponse { S3.gorResponse = rsp, S3.gorMetadata = mdata } <- Aws.pureAws (_aws_cfg cfg) (_aws_s3cfg cfg) (_aws_httpmgr cfg) $ S3.getObject bucket key let bytestream = do -- lookup "content-length" (S3.omUserMetadata mdata)) SB.chunk B.empty -- this will be replaced by content-length hoist lift (HTTP.responseBody rsp) $$+- CL.mapM_ SB.chunk SB.writeFile file bytestream ) -- this is in ResourceT case req of Left _ -> return 2 Right _ -> return 0

来或多或少地从中抽象出来
SB.writeFile

现在,我们需要一个不包含在流式字节串库中的小助手

getObjectAbstracted
      :: (SB.ByteString (ResourceT IO) () -> ResourceT IO b)
         -> AwsConfig Aws.NormalQuery -> S3.Bucket -> Text -> ResourceT IO b
getObjectAbstracted action cfg bucket key = do
    S3.GetObjectResponse { S3.gorResponse = rsp, S3.gorMetadata = mdata } <- 
      Aws.pureAws (_aws_cfg cfg) 
                  (_aws_s3cfg cfg) 
                  (_aws_httpmgr cfg) 
                  (S3.getObject bucket key)

    action (hoist lift (HTTP.responseBody rsp)  $$+- CL.mapM_ SB.chunk) 

并且可以使用streaming bytestring

或多或少地进行@haoformayor计划
mapMChunks_ :: Monad m => (B.ByteString -> m ()) -> SB.ByteString m r -> m r
mapMChunks_ act bytestream = do
  (a S.:> r) <- SB.foldlChunksM (\_ bs -> act bs) (return ()) bytestream
  return r

当然,到目前为止,这一切都没有使用writeConnection :: MonadIO m => WebSockets.Connection -> SB.ByteString m r -> m r writeConnection connection = mapMChunks_ (liftIO . WebSockets.sendBinaryData connection) -- following `haoformayor` connectWrite :: (MonadResource m, WebSockets.WebSocketsData a) => WebSockets.PendingConnection -> a -- closing message -> SB.ByteString m r -- stream from aws -> m r connectWrite request closeMessage bytestream = do (releaseKey, connection) <- allocate (WebSockets.acceptRequest request) (`WebSockets.sendClose` closeMessage) writeConnection connection bytestream getObjectWS :: WebSockets.WebSocketsData a => WebSockets.PendingConnection -> a -> AwsConfig Aws.NormalQuery -> S3.Bucket -> Text -> ResourceT IO () getObjectWS request closeMessage = getObjectAbstracted (connectWrite request closeMessage) conduit / streaming之间的区别。