Question

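The test program at the end of this question is supposed to buffer data in a PassThrough stream object and then, once everything has been buffered, report how much there is. (It is cut down from a real program, which is supposed to buffer data in either a PassThrough or a gzip stream object, depending on settings, and then hand it to a library that wants a readable stream with all the data already available.)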

It works for small "blobs", for example:

$ node --version
v8.11.1

$ node test.js 128; echo $?
s.write called
done writing
s.write called
s.final called
s.final: 128 bytes written
run complete
42

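But if the data passed to the s.write call is large enough, the final hook is never called. Instead, the interpreter exits silently after calling the write hook twice, the way it does when it thinks there are no more tasks to run.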

$ node wtf.js 16385; echo $?
s.write called
done writing
s.write called
0

(On my computer the cutoff is exactly 16385; 16384 or fewer bytes work fine. I assume this is some internal size limit and that it may not always be the same.)

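The writable.write documentation led me to believe that a paused transform stream should be willing to buffer an arbitrarily large amount of data. What gives? How do I make this work reliably? (When answering, bear in mind that in the real program the PassThrough stream might be a zlib.createGzip stream instead.)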

"use strict";

const stream  = require("stream");

let n = parseInt(process.argv[2], 10);
if (!Number.isFinite(n) || n <= 0) {
    console.error(`usage: ${process.argv[1]} nbytes`);
    process.exit(1);
}
let blob = "x".repeat(n-1);

let shim = new stream.PassThrough();
let strm = new stream.Writable({
    write(d, e, c) {
        console.log("s.write called");
        shim.write(d, e, c);
    },
    final(c)       {
        console.log("s.final called");
        shim.end();
        let buf = shim.read();
        console.log(`s.final: ${buf.length} bytes written`);
        c();
    }
});

function run(s) {
    return new Promise((res, rej) => {
        s.on("finish", res);
        s.on("error", rej);
        s.write(blob);
        s.end("\n");
        console.log("done writing");
    }).then(() => {
        console.log("run complete");
        return 42;
    }, (e) => {
        console.log("write error");
        console.error(e);
        return 19;
    });
}

run(strm).then(process.exit);

Answer 1

I don't have a complete solution, but I do have a few (big) clues that will hopefully get you on your way.

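First, 16384 is the default highWaterMark for the buffer that writable streams use; this is documented. Once the buffered data reaches that mark (in a stream that is not draining), write() starts returning false as a signal to the source to stop sending data. The source is of course free to ignore this signal and keep dumping data into the stream (as you have done). As you correctly observed, Node will (and does) keep buffering the written chunks until it runs out of memory and crashes. But if the stream's buffer still holds undrained data when you call end(), the stream cannot finish until that data drains: until then there is no finish event and no _final() call.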
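A minimal sketch of that backpressure signal. The stalled sink below is hypothetical: its write hook never calls its callback, standing in for a consumer that never drains, and highWaterMark is set explicitly so the numbers do not depend on the default.

```javascript
const { Writable } = require("stream");

// Hypothetical sink that never acknowledges a write, so nothing ever drains.
const stalled = new Writable({
    highWaterMark: 16384,
    write(chunk, encoding, callback) {
        // Deliberately never call callback().
    }
});

// 16383 bytes buffered: still below the mark, so write() returns true.
const first = stalled.write("x".repeat(16383));
// One more byte brings the total to 16384: write() now returns false.
const second = stalled.write("x");

console.log(first, second, stalled.writableLength);
```

Nothing stops the caller from writing again after the false return value; the extra chunks simply pile up in the stream's internal buffer.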

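So what happens with small (that is, under 16384 bytes) blobs? Your writable stream is able to drain completely into the passThrough (which itself never drains, but that is another story). So it calls _final(), emits finish, and ends gracefully.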

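With more than 16384 bytes of data, the passThrough's own highWaterMark is reached after the first write of 16384 bytes. As far as the outer writableStream is concerned that write does drain, but the next write, the "\n" from end(), does not. So when you call end(), the writableStream still has data in its buffer and never gets to finish before the process exits. No finish event, no _final() call.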
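The stall can be reproduced without the outer Writable. The sketch below is only an illustration: demoStall is a hypothetical helper, the blob is made twice the size of an explicitly chosen highWaterMark so that the result does not hinge on the exact cutoff, and the 20 ms wait is an arbitrary grace period. It records which write callbacks have fired before and after something starts reading from the PassThrough.

```javascript
const { PassThrough } = require("stream");

// Hypothetical helper: shows that the callback for the trailing "\n"
// is held back until the readable side of the PassThrough is consumed.
function demoStall() {
    return new Promise((resolve) => {
        const shim = new PassThrough({ highWaterMark: 16384 });
        const acked = [];

        shim.write("x".repeat(32768), () => acked.push("blob"));
        shim.write("\n", () => acked.push("newline"));

        setImmediate(() => {
            // Nobody has read from shim yet.
            const beforeRead = acked.slice();
            // Start consuming (and discarding) the buffered output.
            shim.resume();
            setTimeout(() => {
                resolve({ beforeRead, afterRead: acked.slice() });
            }, 20);
        });
    });
}

demoStall().then(({ beforeRead, afterRead }) => {
    console.log("acknowledged before reading:", beforeRead);
    console.log("acknowledged after reading:", afterRead);
});
```

In the test program nothing ever reads from shim before final runs, and final cannot run until the "\n" write has been acknowledged, so the two wait on each other.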

Here are a few interesting experiments you can try (one at a time):

a) Increase the passThrough's highWaterMark. Set it to more than 16384 and you should be able to push more data into the writableStream.

b) Instead of passing the callback argument c of your writableStream's _write() straight through to shim.write(), call the callback yourself after calling shim.write():

shim.write(d,e);
c();

This signals to the writableStream that you have drained the data, whether or not the destination has actually consumed it. Ugly, but it works.

c) Remove the "\n" from end().
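Experiment (b) could be sketched roughly as follows. This is not a confirmed fix, only an illustration under stated assumptions: bufferThrough is a hypothetical helper, and draining the shim until its "end" event (instead of a single shim.read() call) is an assumption added here so that the same code can also be tried with zlib.createGzip().

```javascript
const stream = require("stream");

// Hypothetical helper: pushes `chunks` through `shim` and resolves with
// everything the shim produced, once the shim has emitted "end".
function bufferThrough(shim, chunks) {
    return new Promise((resolve, reject) => {
        let result = null;
        const strm = new stream.Writable({
            write(d, e, c) {
                shim.write(d, e); // forward, but do not wait for the shim
                c();              // experiment (b): acknowledge right away
            },
            final(c) {
                // Assumption beyond the answer: collect output until "end"
                // rather than relying on one read() returning everything.
                const pieces = [];
                shim.on("data", (piece) => pieces.push(piece));
                shim.on("end", () => {
                    result = Buffer.concat(pieces);
                    c();
                });
                shim.end();
            }
        });
        shim.on("error", reject);
        strm.on("error", reject);
        strm.on("finish", () => resolve(result));

        for (const chunk of chunks) {
            strm.write(chunk);
        }
        strm.end();
    });
}

bufferThrough(new stream.PassThrough(), ["x".repeat(100000), "\n"])
    .then((buf) => console.log(`${buf.length} bytes buffered`));
```

Because the write hook acknowledges every chunk immediately, all backpressure is ignored and the whole input sits in memory, which is what the question asks for but is worth keeping in mind for large inputs. Experiment (a) would instead amount to passing a larger highWaterMark option to the PassThrough constructor.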

Buffering data in memory: Writable.final() hook never called
