Question

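I'm trying to figure out why this particular piece of code isn't working for me. I have an applet that is supposed to read a .pdf and display it with the pdf-renderer library, but for some reason, when I read in the .pdf files that sit on my server, they end up corrupt. I've tested this by writing the files back out again.

I've tried viewing the applet in both IE and Firefox, and the corrupt files occur in both. The funny thing is, when I try viewing the applet in Safari (for Windows), the file is actually fine! I understand the JVM might be different, but I'm still lost. I've compiled with Java 1.5. The JVMs are 1.6. The snippet that reads the file is below.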

public static ByteBuffer getAsByteArray(URL url) throws IOException {
        ByteArrayOutputStream tmpOut = new ByteArrayOutputStream();

        URLConnection connection = url.openConnection();
        int contentLength = connection.getContentLength();
        InputStream in = url.openStream();
        byte[] buf = new byte[512];
        int len;
        while (true) {
            len = in.read(buf);
            if (len == -1) {
                break;
            }
            tmpOut.write(buf, 0, len);
        }
        tmpOut.close();
        ByteBuffer bb = ByteBuffer.wrap(tmpOut.toByteArray(), 0,
                                        tmpOut.size());
        //Lines below used to test if file is corrupt
        //FileOutputStream fos = new FileOutputStream("C:\\abc.pdf");
        //fos.write(tmpOut.toByteArray());
        return bb;
}

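I must be missing something, and I've been banging my head trying to figure it out. Any help is greatly appreciated. Thanks.

Edit: To further clarify my situation, the difference between the files before I read them with the snippet and after is that the ones I write out after reading are significantly smaller than the originals. When opened, they are not recognized as .pdf files. No exceptions are being thrown that I'm ignoring, and I have tried flushing to no avail.

This snippet works in Safari, meaning the files are read in their entirety, with no difference in size, and can be opened with any .pdf reader. In IE and Firefox, the files always end up corrupt, consistently at the same smaller size.

I monitored the len variable (when reading a 59 KB file), hoping to see how many bytes get read in on each loop. In IE and Firefox, at 18 KB, in.read(buf) returns -1 as if the file has ended. Safari does not do this.

I'll keep at it, and I appreciate all the suggestions so far.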

Answer 1

In case these small changes make a difference, try this:

public static ByteBuffer getAsByteArray(URL url) throws IOException {
    URLConnection connection = url.openConnection();
    // Since you get a URLConnection, use it to get the InputStream
    InputStream in = connection.getInputStream();
    // Now that the InputStream is open, get the content length
    int contentLength = connection.getContentLength();

    // To avoid having to resize the array over and over and over as
    // bytes are written to the array, provide an accurate estimate of
    // the ultimate size of the byte array
    ByteArrayOutputStream tmpOut;
    if (contentLength != -1) {
        tmpOut = new ByteArrayOutputStream(contentLength);
    } else {
        tmpOut = new ByteArrayOutputStream(16384); // Pick some appropriate size
    }

    byte[] buf = new byte[512];
    while (true) {
        int len = in.read(buf);
        if (len == -1) {
            break;
        }
        tmpOut.write(buf, 0, len);
    }
    in.close();
    tmpOut.close(); // No effect, but good to do anyway to keep the metaphor alive

    byte[] array = tmpOut.toByteArray();

    //Lines below used to test if file is corrupt
    //FileOutputStream fos = new FileOutputStream("C:\\abc.pdf");
    //fos.write(array);
    //fos.close();

    return ByteBuffer.wrap(array);
}

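You forgot to close fos, which may leave that file shorter if your application is still running or is abruptly terminated. Also, I added creating the ByteArrayOutputStream with an appropriate initial size. (Otherwise Java will have to repeatedly allocate a new array and copy, allocate a new array and copy, which is expensive.) Replace the value 16384 with a more appropriate value. 16k is probably small for a PDF, but I don't know what "average" size you expect to download.

Since you use toByteArray() twice (even though one of the calls is in diagnostic code), I assigned the result to a variable. Finally, although it shouldn't make any difference, when you are wrapping the entire array in a ByteBuffer you only need to supply the byte array itself. Supplying the offset 0 and the length is redundant.

Note that if you are downloading large PDF files this way, make sure your JVM is running with a heap large enough to hold several times the largest file size you expect to read. The method you're using keeps the whole file in memory, which is fine as long as you can afford that memory. :)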
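To illustrate the point about ByteBuffer.wrap, here is a minimal sketch that compares the two overloads on the same array. The class name WrapCheck and the method sameView are made up for this illustration and are not part of the original code.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, only for illustration.
final class WrapCheck {
    // Returns true when wrap(array) and wrap(array, 0, array.length)
    // produce buffers with the same position, limit, capacity and content.
    public static boolean sameView(byte[] array) {
        ByteBuffer whole = ByteBuffer.wrap(array);
        ByteBuffer ranged = ByteBuffer.wrap(array, 0, array.length);
        return whole.position() == ranged.position()
                && whole.limit() == ranged.limit()
                && whole.capacity() == ranged.capacity()
                && whole.equals(ranged);
    }

    public static void main(String[] args) {
        byte[] sample = {1, 2, 3, 4, 5};
        System.out.println("same view: " + sameView(sample));
    }
}
```

Both calls share the backing array, so the shorter form loses nothing when the whole array is wanted.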
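A rough way to sanity-check the heap before reading is sketched below. The class HeapCheck, its methods, and the safety factor of 3 are assumptions chosen for illustration. "Several times" is not a precise figure, so adjust the factor to taste.

```java
// Hypothetical helper, only for illustration.
final class HeapCheck {
    // Pure check: does fileSize * factor fit in maxHeap?
    // Dividing instead of multiplying avoids long overflow. factor must be > 0.
    public static boolean fits(long fileSize, int factor, long maxHeap) {
        return fileSize <= maxHeap / factor;
    }

    // Same check against the maximum heap the running JVM will attempt to use.
    public static boolean fitsInCurrentHeap(long fileSize, int factor) {
        return fits(fileSize, factor, Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        long pdfSize = 59L * 1024; // the 59 KB file mentioned in the question
        System.out.println("fits: " + fitsInCurrentHeap(pdfSize, 3)); // 3 is an assumed factor
    }
}
```

The value passed as fileSize could come from connection.getContentLength() when the server reports it. When it returns -1 the size is unknown and this check cannot be applied.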

Answer 2

Have you tried calling flush() before closing the tmpOut stream, to make sure all the bytes are written out?
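For what it's worth, ByteArrayOutputStream keeps its bytes in memory and inherits flush() from OutputStream, where it does nothing, so a flush is unlikely to change the outcome here. The sketch below compares the two paths. FlushCheck and its method names are made up for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical helper, only for illustration.
final class FlushCheck {
    // Number of bytes visible through toByteArray() without any flush or close.
    public static int sizeWithoutFlush(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(data, 0, data.length);
        return out.toByteArray().length;
    }

    // Number of bytes visible after an explicit flush() and close().
    public static int sizeWithFlush(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(data, 0, data.length);
        out.flush();
        out.close();
        return out.toByteArray().length;
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = new byte[1000];
        System.out.println(sizeWithoutFlush(sample) + " vs " + sizeWithFlush(sample));
    }
}
```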

Answer 3

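I thought I had the same problem as you, but it turned out my problem was that I assumed you always get a full buffer until you get nothing. But you don't make that assumption. The examples on the net (e.g. java2s/tutorial) use a BufferedInputStream, but that made no difference for me.

You could check whether you actually get the full file in your loop. If you do, then the problem would be in the ByteArrayOutputStream.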
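One way to run that check is sketched below: read until read() returns -1, accepting short reads, then compare the total against the length the server reported. ReadCheck, readAll and isComplete are hypothetical names, and the ByteArrayInputStream in main is only a stand-in for the real connection stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, only for illustration.
final class ReadCheck {
    // Reads until end of stream; each read may return fewer bytes than buf.length.
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[512];
        int len;
        while ((len = in.read(buf)) != -1) {
            out.write(buf, 0, len);
        }
        return out.toByteArray();
    }

    // True when the expected length is unknown (-1) or matches what was read.
    public static boolean isComplete(int expectedLength, byte[] data) {
        return expectedLength == -1 || expectedLength == data.length;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[59 * 1024]; // stand-in for the 59 KB PDF
        byte[] data = readAll(new ByteArrayInputStream(source));
        System.out.println("read " + data.length + " bytes, complete="
                + isComplete(source.length, data));
    }
}
```

In the applet, expectedLength would be the value of connection.getContentLength(). A mismatch points at the stream ending early rather than at the ByteArrayOutputStream.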

Java: Reading a pdf file from a URL into a byte array/ByteBuffer in an applet
