Question

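I'm writing a small node.js application that receives a multipart POST from an HTML form and pipes the incoming data to Amazon S3. The formidable module provides the multipart parsing, exposing each part as a node Stream. The knox module handles the PUT to s3.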

var form = new formidable.IncomingForm()
 ,  s3   = knox.createClient(conf);

form.onPart = function(part) {
    var put = s3.putStream(part, filename, headers, handleResponse);
    put.on('progress', handleProgress);
};

form.parse(req);

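I'm reporting upload progress to the browser client via socket.io, but I'm having difficulty getting these numbers to reflect the real progress of the node-to-s3 upload.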

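When the browser-to-node upload happens near instantly, as it does when the node process is running on the local network, the progress indicator reaches 100% immediately. If the file is large, e.g. 300MB, the progress indicator rises slowly, but still faster than our upstream bandwidth would allow. After hitting 100% progress, the client then hangs, presumably waiting for the s3 upload to finish.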

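I know putStream uses Node's stream.pipe method internally, but I don't understand the details of how this really works. My assumption is that node gobbles up the incoming data as fast as it can, throwing it into memory. If the write stream can take the data fast enough, little data is kept in memory at once, since it can be written and discarded. If the write stream is slow though, as it is here, we presumably have to keep all that incoming data in memory until it can be written. Since we're listening for data events on the read stream in order to emit progress, we end up reporting the upload as going faster than it really is.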

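Is my understanding of this problem anywhere close to the mark? How might I go about fixing it? Do I need to get down and dirty with write, drain and pause?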

Answer 1

Your problem is that stream.pause isn't implemented on the part, which is a very simple read stream of the output from the multipart form parser.
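To see why the missing pause matters, here is a simplified sketch of the flow control that pipe relies on. It is not Node's actual implementation, and `simplePipe` is a hypothetical name used only for illustration. When the destination's write() returns false, the source is asked to pause; it is resumed once the destination emits "drain".

```javascript
// Simplified sketch of pipe-style backpressure (not Node's real implementation).
// `simplePipe` is a hypothetical helper name used for illustration only.
function simplePipe(source, dest) {
  source.on('data', function (chunk) {
    // write() returns false when the destination's internal buffer is full
    if (dest.write(chunk) === false) {
      // ask the source to stop emitting "data" for now
      source.pause();
    }
  });

  // the destination has flushed its buffer, so the source may continue
  dest.on('drain', function () {
    source.resume();
  });

  // no more data: close the destination
  source.on('end', function () {
    dest.end();
  });

  return dest;
}
```

If the source's pause() does nothing, the pause call above has no effect, so chunks keep arriving as fast as the browser sends them and pile up in front of the slower destination.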

Knox instructs the s3 request to emit "progress" events whenever the part emits "data". However, since the part stream ignores pause, the progress events are emitted as fast as the form data is uploaded and parsed.
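Roughly speaking, progress tied to the read side looks like the sketch below. This is an assumption about the general shape, not knox's exact code; `reportProgress` and the field names in the progress object are hypothetical. The counter advances whenever the source emits "data", regardless of how much has actually reached s3.

```javascript
// Hypothetical helper: derives progress from the read side of a stream.
// `total` is the expected size in bytes; `onProgress` receives a plain object.
function reportProgress(source, total, onProgress) {
  var seen = 0;
  source.on('data', function (chunk) {
    // counts bytes as they are read, not as they are written downstream
    seen += chunk.length;
    onProgress({
      written: seen,
      total: total,
      percent: Math.floor((seen * 100) / total)
    });
  });
}
```

Because the numbers come from the source, they only match the real upload when the source is throttled to the speed of the destination.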

The formidable form, however, does know how to both pause and resume (it proxies the calls to the request it's parsing).

Something like this should fix your problem:

form.onPart = function(part) {

    // once pause is implemented, the part will be able to throttle the speed
    // of the incoming request
    part.pause = function() {
      form.pause();
    };

    // resume is the counterpart to pause, and will fire after the `put` emits
    // "drain", letting us know that it's ok to start emitting "data" again
    part.resume = function() {
      form.resume();
    };

    var put = s3.putStream(part, filename, headers, handleResponse);
    put.on('progress', handleProgress);
};

Reporting upload progress from node.js
