Question

Here is my code. imageFile is a PDF file, and the goal is to produce a Base64-encoded file for it. I am using Java 6 and cannot upgrade to Java 7.

Base64InputStream is of type org.apache.commons.codec.binary.Base64InputStream.

private File toBase64(File imageFile) throws Exception {
    LOG.info(this.getClass().getName() + " toBase64 method is called");
    System.out.println("toBase64 is called");
    Base64InputStream in = new Base64InputStream(new FileInputStream(imageFile), true);
    File f = new File("/root/temp/" + imageFile.getName().replaceFirst("[.][^.]+$", "") + "_base64.txt");
    Writer out = new FileWriter(f);
    copy(in, out);
    return f;
}

private void copy(InputStream input, Writer output) throws IOException {
    InputStreamReader in = new InputStreamReader(input);
    copy(in, output);
}

private int copy(Reader input, Writer output) throws IOException {
    long count = copyLarge(input, output);
    if (count > Integer.MAX_VALUE) {
        return -1;
    }
    return (int) count;
}

private static final int DEFAULT_BUFFER_SIZE = 1024 * 4;

private long copyLarge(Reader input, Writer output) {
    char[] buffer = new char[DEFAULT_BUFFER_SIZE];
    long count = 0;
    int n = 0;
    try {
        while (-1 != (n = input.read(buffer))) {
            output.write(buffer, 0, n);
            count += n;
            System.out.println("Count: " + count);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return count;
}

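I was using the IOUtils.copy(InputStream input, Writer output) method, but for some PDF files (note: not all of them) it throws an exception. To debug this, I copied the IOUtils.copy code locally; the exception is thrown after Count: 2630388. Here is the stack trace: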

Root Exception stack trace:
java.io.IOException: Underlying input stream returned zero bytes
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:268)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)

Under what circumstances would the while block above throw this exception:

while (-1 != (n = input.read(buffer))) {
    output.write(buffer, 0, n);
    count += n;
    System.out.println("Count: " + count);
}

Please help me understand the cause and how to fix the problem.

Answer 1

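You should not use Reader/Writer here, at least not without specifying an encoding. They are text-oriented rather than binary, and they always apply a character encoding: either one given explicitly or the default OS encoding, which is not portable. PDF is binary.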
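To illustrate the point about encodings, here is a minimal sketch (not taken from the original code) that decodes bytes as text and encodes them back, which is effectively what a Reader/Writer pair does. The class name TextRoundTripSketch and the sample bytes are made up for the example, and UTF-8 is chosen arbitrarily as the charset.

```java
import java.io.UnsupportedEncodingException;
import java.util.Arrays;

final class TextRoundTripSketch {

    /** Decodes the bytes as text in the given charset, then encodes the text back to bytes. */
    static byte[] roundTrip(byte[] data, String charsetName) throws UnsupportedEncodingException {
        String asText = new String(data, charsetName);
        return asText.getBytes(charsetName);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Arbitrary binary data: 0xFF, 0xFE and a lone 0x80 are not valid UTF-8 sequences.
        byte[] binary = {(byte) 0xFF, (byte) 0xFE, 0x00, (byte) 0x80};
        // Plain ASCII text, such as Base64 output, for comparison.
        byte[] ascii = "JVBERi0xLjQ=".getBytes("US-ASCII");

        System.out.println("binary survives: " + Arrays.equals(binary, roundTrip(binary, "UTF-8")));
        System.out.println("ascii survives: " + Arrays.equals(ascii, roundTrip(ascii, "UTF-8")));
    }
}
```

The invalid byte sequences are replaced during decoding, so the binary array does not come back unchanged, while the ASCII array does.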

For an InputStream, use a readFully (for example DataInputStream.readFully).
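As a rough sketch of what a readFully call looks like, the following wraps an InputStream in a DataInputStream and reads an exact number of bytes. The class and method names (ReadFullySketch, readExactly) are hypothetical, and the approach assumes the length is known in advance, which may not hold for a streamed Base64 conversion.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadFullySketch {

    /**
     * Reads exactly length bytes from the stream.
     * DataInputStream.readFully keeps reading until the array is full
     * and throws EOFException if the stream ends first.
     */
    static byte[] readExactly(InputStream in, int length) throws IOException {
        byte[] result = new byte[length];
        new DataInputStream(in).readFully(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = {1, 2, 3, 4, 5};
        byte[] copy = readExactly(new ByteArrayInputStream(source), source.length);
        System.out.println("read " + copy.length + " bytes");
    }
}
```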

Then always call close(). The copy method, which may well leave closing to its callers, could at least call flush() in that case.
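Since the question is limited to Java 6, one possible shape for a byte-oriented copy that flushes and closes is sketched below. It avoids Reader/Writer entirely and uses only java.io classes available in Java 6. The names ByteCopySketch, copy and copyAndClose are illustrative and are not part of Commons IO or of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class ByteCopySketch {

    private static final int BUFFER_SIZE = 4 * 1024;

    /** Copies all bytes from in to out, flushes out, and returns the byte count. Closes nothing. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long count = 0;
        int n;
        // A read that returns 0 just writes nothing; only -1 ends the loop.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            count += n;
        }
        out.flush();
        return count;
    }

    /** Copies, then closes both streams even if the copy fails. */
    static long copyAndClose(InputStream in, OutputStream out) throws IOException {
        try {
            return copy(in, out);
        } finally {
            try {
                in.close();
            } finally {
                out.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copyAndClose(new ByteArrayInputStream(data), sink);
        System.out.println("copied " + copied + " bytes");
    }
}
```

In toBase64 this could presumably be called as copyAndClose(new Base64InputStream(new FileInputStream(imageFile), true), new FileOutputStream(f)), replacing the FileWriter with a FileOutputStream.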

Java 7 already provides a copy method (Files.copy), but it needs a Path and an extra option.

// Requires Java 7+: java.nio.file.Files and java.nio.file.StandardCopyOption.
private File toBase64(File imageFile) throws Exception {
    LOG.info(this.getClass().getName() + " toBase64 method is called");
    System.out.println("toBase64 is called");
    File f = new File("/root/temp/" + imageFile.getName()
        .replaceFirst("[.][^.]+$", "") + "_base64.txt");

    // try-with-resources closes the stream even if the copy fails.
    try (Base64InputStream in = new Base64InputStream(
            new FileInputStream(imageFile), true)) {
        Files.copy(in, f.toPath(), StandardCopyOption.REPLACE_EXISTING);
    }

    return f;
}
