Question

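We have a system where a client makes an HTTP GET request, the system does some processing on the backend, zips the results, and sends them to the client. Because the processing can take some time, we send the result as a ZipOutputStream wrapping response.getOutputStream().

However, when the first ZipEntry contains only a very small amount of data and the second entry takes a long time, the client's browser times out. We have tried flushing the stream buffers, but no response seems to reach the browser until at least 1000 bytes have been written to the stream. Oddly, once the first 1000 bytes have been sent, subsequent flushes seem to work fine.

I tried to strip the code down to a bare-bones example: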

protected void doGet(HttpServletRequest request,
        HttpServletResponse response) throws ServletException, IOException {
    try {
        ZipOutputStream _zos = new ZipOutputStream( response.getOutputStream());
        ZipEntry _ze = null;
        long startTime = System.currentTimeMillis();
        long _lByteCount = 0;

        response.setContentType("application/zip");

        while (_lByteCount < 2000) {
            _ze = new ZipEntry("foo");
            _zos.putNextEntry( _ze );

            //writes 100 bytes and then waits 10 seconds
            _lByteCount += StreamWriter.write( 
                    new ByteArrayInputStream(DataGenerator.getOutput().toByteArray()),
                    _zos );
            System.out.println("Zip: " + _lByteCount + " Time: " + ((System.currentTimeMillis() - startTime) / 1000));

            //trying to flush
            _zos.finish();
            _zos.flush();
            response.flushBuffer();
            response.getOutputStream().flush();
        }
    } catch (Throwable e) {
        e.printStackTrace();
    }
}

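I set the browser timeout to about 20 seconds to make this easy to reproduce. Even though 100 bytes are written several times, nothing is sent to the browser and the browser times out. If I extend the browser timeout, nothing is sent until 1000 bytes have been written, and then the browser pops up the "Do you want to save..." dialog. Again, after the initial 1000 bytes, each additional 100 bytes is sent fine rather than being buffered into 1000-byte blocks.

If I set the maximum byte count in the while condition to around 200, it works fine and sends just the 200 bytes.

What can I do to force the servlet to send back a really small amount of initial data?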

Answer 1

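It turns out that the underlying Apache/Windows IP stack has a threshold below which it buffers the data in a stream, for efficiency. Since most people have the problem of too much data rather than too little, this is the right behavior most of the time. What we ended up doing was having the user request enough data that the 1000-byte limit would be reached before the timeout. Sorry it took so long to answer this question.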

Answer 2

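I know this is a really, really old question, but for the record I wanted to post an answer that should fix the problem you are experiencing.

The key is that you want to flush the response stream, not the zip stream, because the ZIP stream cannot flush what is not yet ready to be written. As you mentioned, your client is timing out because it does not receive a response within a predetermined time, but once it receives data it will patiently wait a very long time for the file to download. So the fix is easy, provided you flush the right stream. I recommend the following: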

protected void doGet(HttpServletRequest request,
        HttpServletResponse response) throws ServletException, IOException {
    try {
        ZipOutputStream _zos = new ZipOutputStream( response.getOutputStream());
        ZipEntry _ze = null;
        long startTime = System.currentTimeMillis();
        long _lByteCount = 0;

        response.setContentType("application/zip");
        // force an immediate response of the expected content
        // so the client can begin the download process
        response.flushBuffer();

        while (_lByteCount < 2000) {
            _ze = new ZipEntry("foo");
            _zos.putNextEntry( _ze );

            //writes 100 bytes and then waits 10 seconds
            _lByteCount += StreamWriter.write( 
                    new ByteArrayInputStream(DataGenerator.getOutput().toByteArray()),
                    _zos );
            System.out.println("Zip: " + _lByteCount + " Time: " + ((System.currentTimeMillis() - startTime) / 1000));

            //trying to flush
            _zos.finish();
            _zos.flush();
        }
    } catch (Throwable e) {
        e.printStackTrace();
    }
}

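What should happen here is that the headers and response code are committed, along with anything already in the response buffer's OutputStream. This does not close the stream, so any further writes to the stream are appended. The downside of doing it this way is that you cannot know the content length to put in the header. The upside is that the download starts immediately and the browser is not allowed to time out.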
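To try the "commit the headers first" idea without a servlet container, here is a minimal sketch that swaps in the JDK's built-in com.sun.net.httpserver.HttpServer for the servlet API. The class name EarlyCommitZipDemo, the entry contents, and the delay are all made up for illustration. It also uses closeEntry() per entry instead of finish(), which is a choice of this sketch and not something the answer above prescribes.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; the names here are illustrative only.
class EarlyCommitZipDemo {

    /** Starts a local server that streams a zip with the given number of entries. */
    static HttpServer start(int entries, long delayMillis) throws IOException {
        // Port 0 asks the OS for any free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            exchange.getResponseHeaders().set("Content-Type", "application/zip");
            // Send status and headers before any slow work starts. A length of 0
            // selects chunked transfer encoding, so no Content-Length is needed.
            exchange.sendResponseHeaders(200, 0);
            try (ZipOutputStream zos = new ZipOutputStream(exchange.getResponseBody())) {
                for (int i = 1; i <= entries; i++) {
                    pause(delayMillis); // stands in for slow backend work
                    zos.putNextEntry(new ZipEntry("foo" + i));
                    zos.write(("entry " + i).getBytes(StandardCharsets.UTF_8));
                    zos.closeEntry(); // completes this entry's compressed data
                    zos.flush();      // passes the flush on to the response body
                }
            }
        });
        server.start();
        return server;
    }

    /** Fetches the zip from a freshly started server and counts its entries. */
    static int roundTrip(int entries, long delayMillis) {
        HttpServer server = null;
        try {
            server = start(entries, delayMillis);
            int port = server.getAddress().getPort();
            URI uri = URI.create("http://127.0.0.1:" + port + "/");
            int count = 0;
            try (InputStream in = uri.toURL().openStream();
                 ZipInputStream zis = new ZipInputStream(in)) {
                while (zis.getNextEntry() != null) {
                    count++;
                }
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("entries received: " + roundTrip(3, 500));
    }
}
```

In a servlet the equivalent of sendResponseHeaders(200, 0) would be the response.flushBuffer() call shown above; whether a given container and any proxy in front of it forward those bytes immediately still has to be checked in the real deployment.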

Answer 3

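My guess is that the zip output stream does not actually write anything until it is able to compress something. The Huffman algorithm used for compression requires all of the data to be known before it can actually compress anything. It cannot start until it basically knows everything.

Zipping may be a win if the data is large, but I don't think you can get an asynchronous response while zipping data.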
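The guess above can be checked with the standard library alone. The sketch below (the class name ZipFlushDemo is made up) wraps a ByteArrayOutputStream in a ZipOutputStream and records how many bytes have reached the underlying stream after putNextEntry(), after write() plus flush(), and after closeEntry(). It makes no claim about exact sizes, only that closing the entry releases bytes that flush() alone did not.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class: measures how many bytes a ZipOutputStream has
// handed to the stream underneath it at three points in an entry's life.
class ZipFlushDemo {

    /** Returns sink sizes after {putNextEntry, write + flush, closeEntry}. */
    static int[] measure() {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            ZipOutputStream zos = new ZipOutputStream(sink);

            zos.putNextEntry(new ZipEntry("foo"));
            int afterHeader = sink.size();   // the entry's local header

            zos.write(new byte[100]);        // 100 bytes of entry data
            zos.flush();                     // flushes the sink, not the compressor
            int afterFlush = sink.size();

            zos.closeEntry();                // finishes compression for this entry
            int afterCloseEntry = sink.size();

            zos.close();
            return new int[] {afterHeader, afterFlush, afterCloseEntry};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = measure();
        System.out.println("after putNextEntry: " + sizes[0]);
        System.out.println("after write+flush:  " + sizes[1]);
        System.out.println("after closeEntry:   " + sizes[2]);
    }
}
```

So the compressor does hold data back, but per entry rather than until the whole archive is known: calling closeEntry() before flush() is what makes a small entry available to the layers below.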

Answer 4

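You may be getting bitten by the Java API.

Looking through the JavaDoc of the various OutputStream family of classes (OutputStream, ServletOutputStream, FilterOutputStream, and ZipOutputStream), they either say that they rely on the underlying stream for flush() or state that flush() does nothing (OutputStream).

ZipOutputStream inherits flush() and write() from FilterOutputStream.

From the FilterOutputStream JavaDoc:

The flush method of FilterOutputStream calls the flush method of its underlying output stream.

In the case of ZipOutputStream, it is wrapped around the stream returned from ServletResponse.getOutputStream(), which is a ServletOutputStream. It turns out that ServletOutputStream doesn't implement flush() either; it inherits it from OutputStream, whose JavaDoc specifically mentions: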

 public void flush() throws IOException

 Flushes this output stream and forces any
 buffered output bytes to be written out. The general contract of flush
 is that calling it is an indication that, if any bytes previously
 written have been buffered by the implementation of the output stream,
 such bytes should immediately be written to their intended
 destination.  If the intended destination of this stream is an
 abstraction provided by the underlying operating system, for example a
 file, then flushing the stream guarantees only that bytes previously
 written to the stream are passed to the operating system for writing;
 it does not guarantee that they are actually written to a physical
 device such as a disk drive. 

  **The flush method of OutputStream does nothing.**

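Maybe this is a special case, I don't know. I do know that flush() has been around for a long time, and it is unlikely that no one has noticed a hole in the functionality there.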
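The delegation described in the JavaDoc can be observed directly. In this small sketch, CountingSink is a made-up stand-in for the stream a servlet container would hand out; it counts how many flush() calls reach it through a FilterOutputStream and through a ZipOutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; CountingSink stands in for a container's stream.
class FlushDelegationDemo {

    /** In-memory sink that counts how often flush() reaches it. */
    static final class CountingSink extends ByteArrayOutputStream {
        int flushCalls = 0;

        @Override
        public void flush() throws IOException {
            flushCalls++;
            super.flush(); // OutputStream.flush(), which does nothing
        }
    }

    /** Wraps a fresh sink, calls flush() once on the wrapper, reports what the sink saw. */
    static int flushesSeenBySink(boolean wrapInZip) {
        try {
            CountingSink sink = new CountingSink();
            OutputStream wrapper = wrapInZip
                    ? new ZipOutputStream(sink)
                    : new FilterOutputStream(sink);
            wrapper.flush();
            return sink.flushCalls;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("via FilterOutputStream: " + flushesSeenBySink(false));
        System.out.println("via ZipOutputStream:    " + flushesSeenBySink(true));
    }
}
```

Since the call does reach the wrapped stream, what happens next depends on the concrete class the container returns from getOutputStream(). Containers typically supply their own subclass that overrides flush(), so the "does nothing" default in OutputStream is not necessarily what runs.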

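It makes me wonder whether there is an operating system component to the stream buffering that could be configured to remove the 1k buffer effect.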

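A related question describes a similar problem, but it works directly with files rather than through Java's Stream abstraction, and this answer there points to MSDN articles on file buffering and {{3}}.

File caching is also listed in the bug database.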

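Summary

The Java IO library relies on the OS implementation of streams. If the OS has caching turned on, Java code may not be able to force different behavior. On Windows, you have to open the file and pass non-standard parameters to get write-through caching or unbuffered behavior. I doubt the Java SDK provides such OS-specific options, since they are trying to create platform-generic APIs.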

Answer 5

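I cannot reproduce your problem at all. Below is your code, slightly altered, running in an embedded Jetty server. I ran it in IntelliJ and requested http://localhost:8080 from Firefox. As expected, the "Save or Open" dialog popped up after 1 second. Selecting "Save" and waiting 20 seconds results in a zip file that can be opened and contains 20 separate entries named foo<number>, each containing a single line 100 characters wide and ending in <number>. This is on Windows 7 Premium 64 with JDK 1.6.0_26. Chrome behaves the same way. IE, on the other hand, usually seems to wait 5 seconds (500 bytes), although once it showed the dialog immediately and another time it seemed to wait 9 or 10 seconds. Try it in different browsers: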

import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.ServletContextHandler;
import org.eclipse.jetty.servlet.ServletHolder;

import javax.servlet.ServletException;
import javax.servlet.http.*;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZippingAndStreamingServlet {
    public static void main(String[] args) throws Exception {
        Server server = new Server(8080);
        ServletContextHandler context = new ServletContextHandler(ServletContextHandler.SESSIONS);
        context.setContextPath("/");
        server.setHandler(context);
        context.addServlet(new ServletHolder(new BufferingServlet()), "/*");
        server.start();
        System.out.println("Listening on 8080");
        server.join();
    }

    static class BufferingServlet extends HttpServlet {
        protected void doGet(HttpServletRequest request,
                             HttpServletResponse response) throws ServletException, IOException {
            ZipOutputStream _zos = new ZipOutputStream(response.getOutputStream());
            ZipEntry _ze;
            long startTime = System.currentTimeMillis();
            long _lByteCount = 0;
            int count = 1;
            response.setContentType("application/zip");
            response.setHeader("Content-Disposition", "attachment; filename=my.zip");
            while (_lByteCount < 2000) {
                _ze = new ZipEntry("foo" + count);
                _zos.putNextEntry(_ze);
                byte[] bytes = String.format("%100d", count++).getBytes();
                System.out.println("Sending " + bytes.length + " bytes");
                _zos.write(bytes);
                _lByteCount += bytes.length;
                sleep(1000);
                System.out.println("Zip: " + _lByteCount + " Time: " + ((System.currentTimeMillis() - startTime) / 1000));
                _zos.flush();
            }
            _zos.close();
        }

        private void sleep(int millis) {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                throw new IllegalStateException("Unexpected interrupt!", e);
            }
        }
    }
}

Answer 6

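The problem is that by default every servlet implementation buffers the data, while SSE and other custom requirements may need the data immediately.

The solution is to do the following:

response.setBufferSize(1); // or some similar small number for such servlets.

This will ensure the data is written out earlier (at a cost in performance).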
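The effect of a buffer size on when bytes leave can be illustrated with java.io.BufferedOutputStream. This is only an analogy for the servlet response buffer, not the container's actual implementation, and the class name BufferSizeDemo is invented for the sketch. It writes 100-byte chunks without ever calling flush() and reports how many bytes reached the underlying sink.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class: an analogy for a response buffer, not servlet code.
class BufferSizeDemo {

    /** Bytes that reached the sink after `writes` chunks of 100 bytes, with no flush. */
    static int delivered(int bufferSize, int writes) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            BufferedOutputStream out = new BufferedOutputStream(sink, bufferSize);
            byte[] chunk = new byte[100];
            for (int i = 0; i < writes; i++) {
                out.write(chunk);
            }
            return sink.size(); // deliberately no flush() or close()
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("1000-byte buffer, 5 writes:  " + delivered(1000, 5));
        System.out.println("1000-byte buffer, 11 writes: " + delivered(1000, 11));
        System.out.println("1-byte buffer, 5 writes:     " + delivered(1, 5));
    }
}
```

With the large buffer nothing arrives until the buffer overflows, while the tiny buffer passes every chunk straight through. In a servlet, setBufferSize must be called before any response content is written.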

Servlet buffering response despite calls to flush()
