Question

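I have been stuck on this problem for quite a while. I have googled it and gone through all the "chunked"-related links on SO, so I finally decided to post this question. Briefly: I am working with Java code that uses a socket to read a response over HTTPS. The response is received correctly and everything works, except when the Transfer-Encoding is "chunked". I tried reading the chunked response from the socket as a byte array, and when I convert it to a String the response is unreadable. I suspect I am doing something wrong when processing the chunked data. Because of this, I also get a "Not in GZip format" exception when I try to decompress the response. The code I use to process the chunks is: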

    int chunkLength;

    do {
        String lengthLine = inStream.readLine();
        if (lengthLine == null) {
            return false;
        }
        chunkLength = Integer.parseInt(lengthLine.trim(), 16);
        if (chunkLength > 0) {
            byte[] chunk = new byte[chunkLength];
            int bytesRead = inStream.read(chunk);
            if (bytesRead < chunkLength) {
                return false;
            }
            //Burn a CR/LF
            inStream.readLine();
        }//if chunkLength
    } while (chunkLength > 0) ;
    return true;

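Since I am new to asking questions here, I may have left out some (possibly many) details that you need in order to suggest a solution. Please excuse me if so, and let me know if you need more details. Any help would be greatly appreciated. Cheers.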

Answer 1

I see three problems with this code:

You are not taking into account that an individual chunk may carry extension information, which you are not skipping. It is not common, but it is part of the spec, so you should code for it. Otherwise, your Integer.parseInt() call will fail if you encounter one.

You are not reading the entire chunk data. Since you are using inStream.read(), it may return fewer bytes than requested. That is normal behavior for a socket, so do not stop reading if it happens. You need to call read() in a loop until chunkLength bytes have been received in full. Stop reading only if a real error is reported.

You are not reading the trailing HTTP headers that appear after the last chunk. Even if there are no headers, there is still a CRLF terminator that ends the HTTP response.

Try something more like this:

    try {
        String line;

        do {
            // read the chunk header
            line = inStream.readLine();
            if (line == null) {
                return false;
            }

            // ignore any extensions after the chunk size
            int idx = line.indexOf(';');
            if (idx != -1) {
                line = line.substring(0, idx);
            }

            // parse the chunk size (hexadecimal)
            int chunkLength = Integer.parseInt(line.trim(), 16);
            if (chunkLength < 0) {
                return false;
            }

            // has the last chunk been reached?
            if (chunkLength == 0) {
                break;
            }

            // read the chunk data, looping because read() may return short
            byte[] chunk = new byte[chunkLength];
            int offset = 0;
            do {
                int bytesRead = inStream.read(chunk, offset, chunkLength - offset);
                if (bytesRead < 0) {
                    return false;
                }
                offset += bytesRead;
            } while (offset < chunkLength);

            // burn a CRLF at the end of the chunk
            inStream.readLine();

            // now do something with the chunk...
        } while (true);

        // read trailing HTTP headers
        do {
            line = inStream.readLine();
            if (line == null) {
                return false;
            }

            // has the last header been read?
            if (line.isEmpty()) {
                break;
            }

            // process the line as needed...
        } while (true);

        // all done
        return true;
    } catch (Exception e) {
        return false;
    }

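With that said, keep in mind that chunking does not negate the fact that TCP/HTTP deliver a stream of bytes. Each chunk is just a smaller piece of the larger data. So do not try to convert each chunk as-is to a String, or to decompress it as a complete unit. You need to collect the chunks into a file or container of your choosing, and then process the collected data as a whole once you have reached the end of the HTTP response. The exception is when you push the chunks into a streaming processor, such as a GZip decompressor that supports push streaming. If you do need to convert the collected data to a String, make sure you use the charset specified in the HTTP response's Content-Type header (or an appropriate default if no charset is specified), so that the collected data is decoded correctly into Java's native UTF-16 string encoding.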
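To make the "collect first, then process as a whole" advice concrete, here is a minimal sketch rather than a drop-in implementation. The class name ChunkedBody and the helpers readLine, readChunkedBody, gunzip and charsetOf are made up for illustration, and the sketch assumes you already hold an InputStream positioned just after the response headers. It reads lines byte by byte instead of relying on a readLine() from a buffering reader, so no chunk bytes get swallowed by a text buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
final class ChunkedBody {

    // Reads one line terminated by LF (CR is dropped); returns null at end of stream.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') line.write(b);
        }
        if (b == -1 && line.size() == 0) return null;
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Collects the payload of every chunk into a single byte array.
    static byte[] readChunkedBody(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String header = readLine(in);
            if (header == null) throw new EOFException("missing chunk header");
            int semi = header.indexOf(';');            // skip chunk extensions
            if (semi != -1) header = header.substring(0, semi);
            int length = Integer.parseInt(header.trim(), 16);
            if (length < 0) throw new IOException("negative chunk size");
            if (length == 0) break;                    // last chunk
            byte[] chunk = new byte[length];
            int offset = 0;
            while (offset < length) {                  // read() may return short
                int n = in.read(chunk, offset, length - offset);
                if (n < 0) throw new EOFException("truncated chunk");
                offset += n;
            }
            body.write(chunk, 0, length);
            readLine(in);                              // CRLF after the chunk data
        }
        String trailer;                                // trailers end at an empty line
        while ((trailer = readLine(in)) != null && !trailer.isEmpty()) {
            // trailer headers are ignored in this sketch
        }
        return body.toByteArray();
    }

    // Decompresses the complete body, never an individual chunk.
    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // Picks the charset parameter out of a Content-Type value, else the fallback.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    return Charset.forName(p.substring(8).trim().replace("\"", ""));
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) throws IOException {
        // Build a fake gzip-compressed, chunked response body split into two chunks.
        ByteArrayOutputStream gz = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(gz)) {
            out.write("chunked h\u00e9llo".getBytes(StandardCharsets.UTF_8));
        }
        byte[] payload = gz.toByteArray();
        int half = payload.length / 2;
        Charset ascii = StandardCharsets.ISO_8859_1;
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        wire.write((Integer.toHexString(half) + ";demo=1\r\n").getBytes(ascii));
        wire.write(payload, 0, half);
        wire.write(("\r\n" + Integer.toHexString(payload.length - half) + "\r\n").getBytes(ascii));
        wire.write(payload, half, payload.length - half);
        wire.write("\r\n0\r\n\r\n".getBytes(ascii));

        byte[] body = readChunkedBody(new ByteArrayInputStream(wire.toByteArray()));
        Charset cs = charsetOf("text/plain; charset=utf-8", ascii);
        System.out.println(new String(gunzip(body), cs));
    }
}
```

Neither half of the gzip payload would decompress on its own here, which is the same kind of failure that produces "Not in GZip format" when a single chunk is handed to the decompressor. The ISO-8859-1 fallback passed to charsetOf is only a placeholder; choose whatever default suits the content type you expect.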

Getting a chunked HTTPS response is not working
