Question

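tl;dr: If a stream pipeline has several stages (A -> B -> C), and the start of the pipeline (A -> B) can pull data in faster than the output end (B -> C) can send it out, will that cause memory consumption to grow?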

I'm still a beginner with Node streams and want to check my understanding...

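I'm building a tool that pulls photos from S3, streams them into a zip archive, and streams that archive into an S3 upload. I'm using streams because I thought they would let me move the data quickly without using much memory.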

However, I've found that memory does grow steadily as the script runs and pulls more and more photos (we're talking on the order of 1,000 photos).

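My question is whether this is caused by a backpressure problem in the system, for example the upload stream not sending data out as fast as the download streams feed it into the zip stream.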
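To make the backpressure scenario concrete, here is a minimal sketch using only Node's built-in `stream` module. It does not involve S3 or `s3-zip`. The helper names `makeSlowWritable` and `writeIgnoringBackpressure`, the 16 KiB `highWaterMark`, and the 1 ms delay are all illustrative assumptions. It shows that a slow `Writable` keeps accepting chunks after `write()` starts returning `false`, so a producer that ignores that signal lets the internal buffer grow past `highWaterMark`.

```javascript
const { Writable } = require('stream')

// Hypothetical slow consumer: acknowledges each chunk only after 1 ms.
function makeSlowWritable () {
  return new Writable({
    highWaterMark: 16 * 1024, // assumed threshold, for illustration only
    write (chunk, encoding, callback) {
      setTimeout(callback, 1)
    }
  })
}

// Hypothetical fast producer: writes `count` chunks in a tight loop and
// never waits for 'drain', i.e. it ignores backpressure entirely.
function writeIgnoringBackpressure (writable, count, chunkSize) {
  let sawFalse = false
  for (let i = 0; i < count; i++) {
    // write() returns false once the buffered bytes reach highWaterMark,
    // but the chunk is still queued in memory.
    if (!writable.write(Buffer.alloc(chunkSize))) sawFalse = true
  }
  return { sawFalse, buffered: writable.writableLength }
}

const slow = makeSlowWritable()
const result = writeIgnoringBackpressure(slow, 100, 1024)
console.log(result)
slow.end()
```

When the source is connected with `readable.pipe(writable)` instead, `pipe()` pauses the source once `write()` returns `false` and resumes it on `'drain'`, which keeps each stage's buffer near its `highWaterMark`. Whether every stage in the pipeline described here honors that signal is exactly what would need to be checked.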

For clarity, the handler code below omits much of the error handling and logging.

const AWS = require('aws-sdk')
const s3Zip = require('s3-zip')

exports.handler = function (event, context) {
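  const { region, bucket, folder, files, zipFileName } = event
  // S3 client used for the upload below (assumed to use the same region as the source bucket)
  const s3 = new AWS.S3({ region })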
  /**
   * s3Zip.archive is a library call that reads multiple S3 objects using #getObject().createReadStream() and pipes
   * them into a zip using the `archiver` library
   */
  const body = s3Zip.archive({ region, bucket }, folder, files)
  /**
   * s3.upload takes a stream Body and uploads it to s3
   * https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#upload-property
   */
  const upload = s3.upload({ Bucket: bucket, Key: zipFileName, Body: body })
  upload.send((err, result) => { /* error handling or trigger a success */ })
}
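One way to investigate this would be to measure how many bytes leave the zip stage and compare that with process memory over time. The sketch below is only a diagnostic idea, not part of the original tool. `createByteCounter` and `startMemoryLog` are hypothetical helpers built on Node's `Transform` and `process.memoryUsage()`, and the wiring shown in the comments assumes that the stream returned by `s3Zip.archive` can be piped like any other readable stream.

```javascript
const { Transform } = require('stream')

// Hypothetical helper: a pass-through stage that counts the bytes flowing through it.
function createByteCounter () {
  let bytes = 0
  const counter = new Transform({
    transform (chunk, encoding, callback) {
      bytes += chunk.length
      callback(null, chunk) // forward the chunk unchanged
    }
  })
  counter.bytesSeen = () => bytes
  return counter
}

// Hypothetical helper: periodically log process memory next to the byte count.
function startMemoryLog (counter, intervalMs) {
  const timer = setInterval(() => {
    const { rss, heapUsed, external } = process.memoryUsage()
    console.log({ zippedBytes: counter.bytesSeen(), rss, heapUsed, external })
  }, intervalMs)
  timer.unref() // do not keep the process alive just for logging
  return timer
}

// Assumed wiring inside the handler (not executed here):
//   const counter = createByteCounter()
//   const body = s3Zip.archive({ region, bucket }, folder, files).pipe(counter)
//   const timer = startMemoryLog(counter, 1000)
//   upload.send((err, result) => { clearInterval(timer) })

// Tiny local check that the counter forwards and counts data.
const demoCounter = createByteCounter()
demoCounter.write(Buffer.alloc(10))
console.log(demoCounter.bytesSeen())
```

If `rss` or `external` keeps climbing while the byte count advances, data is probably accumulating somewhere in the pipeline. This measurement alone does not show which stage is holding it.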

Memory buildup in Node streams

0 answers: