Question

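I am using zlib to compress a stream of text data. The text data arrives in chunks, and for each chunk I call deflate() with flush set to Z_NO_FLUSH. Once all chunks have been retrieved, I call deflate() with flush set to Z_FINISH.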

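Naturally, deflate() does not produce compressed output on every call. It accumulates data internally to achieve a high compression ratio. That is fine! Every time deflate() does produce compressed output, that output is appended to a database field, which is a slow process.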

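However, once deflate() produces compressed data, that data may not fit into the provided output buffer, deflate_out. Several calls to deflate() are then required. That is what I want to avoid: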

Is there a way to make deflate_out always large enough that deflate() can store all of the compressed data in it every time it decides to produce output?

Notes:

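The total size of the uncompressed data is not known beforehand. As mentioned above, the uncompressed data arrives in chunks, and the compressed data is appended to the database field in chunks as well.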

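In the include file zconf.h I found the following comment. Is that perhaps what I am looking for? That is, is (1 << (windowBits+2)) + (1 << (memLevel+9)) the maximum size of compressed data that deflate() may produce?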

/* The memory requirements for deflate are (in bytes):
            (1 << (windowBits+2)) +  (1 << (memLevel+9))
 that is: 128K for windowBits=15  +  128K for memLevel = 8  (default values)
 plus a few kilobytes for small objects. For example, if you want to reduce
 the default memory requirements from 256K to 128K, compile with
     make CFLAGS="-O -DMAX_WBITS=14 -DMAX_MEM_LEVEL=7"
 Of course this will generally degrade compression (there's no free lunch).

   The memory requirements for inflate are (in bytes) 1 << windowBits
 that is, 32K for windowBits=15 (default value) plus a few kilobytes
 for small objects.
*/
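
The arithmetic in the comment above can be checked with a small sketch. The helper name deflate_memory_bytes is hypothetical and not part of zlib; the value it returns is the working memory the comment describes, excluding the "few kilobytes for small objects".

```c
#include <assert.h>

/* Working memory for deflate as given by the zconf.h comment:
   (1 << (windowBits+2)) + (1 << (memLevel+9)), in bytes.
   Hypothetical helper, not a zlib function. */
static unsigned long deflate_memory_bytes(int window_bits, int mem_level)
{
    assert(window_bits >= 8 && window_bits <= 15);
    assert(mem_level >= 1 && mem_level <= 9);
    return (1UL << (window_bits + 2)) + (1UL << (mem_level + 9));
}
```

With the defaults, deflate_memory_bytes(15, 8) gives 131072 + 131072 = 262144 bytes (256K), and deflate_memory_bytes(14, 7) gives 131072 bytes (128K), matching the figures in the comment.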

Answer 1

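deflateBound() is only helpful if you do all of the compression in a single step, or if you force deflate() to compress all of the input currently available to it and emit compressed data for all of that input. You can do the latter with a flush parameter such as Z_BLOCK, Z_PARTIAL_FLUSH, etc.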

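If you want to use Z_NO_FLUSH, then trying to predict the maximum amount of output deflate() might emit on the next call becomes far more difficult, as well as inefficient. You do not know how much of the input had been consumed when the last burst of compressed data was emitted, so you have to assume almost none of it, and the buffer size grows unnecessarily. However you attempt to estimate the maximum output, you will be doing a lot of unnecessary mallocs or reallocs for no good reason, which is inefficient.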

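There is no point in avoiding calls to deflate() for more output. If you simply loop on deflate() until it has no more output for you, then you can use a fixed output buffer that is malloced once. That is how the deflate() and inflate() interface was designed to be used. See http://zlib.net/zlib_how.html for a well-documented example of how to use the interface.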

By the way, the latest version of zlib (1.2.6) has a deflatePending() function that tells you how much output deflate() has waiting to be delivered.

Answer 2

While looking through the sources for a hint, I came across

/* =========================================================================
 * Flush as much pending output as possible. All deflate() output goes
 * through this function so some applications may wish to modify it
 * to avoid allocating a large strm->next_out buffer and copying into it.
 * (See also read_buf()).
 */
local void flush_pending(strm)
    z_streamp strm;
{
    unsigned len = strm->state->pending;
...

Tracing the use of flush_pending() throughout deflate() suggests that an upper bound on the output buffer needed in the middle of the stream is

strm->state->pending + deflateBound(strm, strm->avail_in)

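The first part accounts for data still in the pipe from previous calls to deflate(); the second part accounts for the not yet processed data of length avail_in.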

zlib, deflate: How much memory to allocate?
