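I am converting a PDF to PNG images in separate threads. My concern is that this requires a lot of memory. I took a heap dump during rendering.
My question: is it normal for each page of the PDF document to have three objects in memory? According to the heap dump these are int[], BufferedImage, and IntegerInterleavedRaster. Can we somehow get rid of unnecessary data in memory, for example by not creating the int[]? I followed the advice in the Apache PDFBox documentation, but it did not help reduce memory usage.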
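To put rough numbers on the three objects, here is a minimal sketch using only the JDK. It assumes that ImageType.RGB yields an int-backed BufferedImage (which would be consistent with the int[] in the heap dump) and that the pixel size is the page size in points scaled by dpi / 72 and rounded down. The class name PageMemoryEstimate and both method names are hypothetical, and the A4 size of 595 x 842 points is only an example.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferInt;

public class PageMemoryEstimate {

    /** Rough size in bytes of the int[] behind one RGB page image (4 bytes per pixel). */
    public static long estimateBytes(double widthPt, double heightPt, double dpi) {
        double scale = dpi / 72.0; // PDF page sizes are given in units of 1/72 inch
        long widthPx = (long) Math.max(Math.floor(widthPt * scale), 1);
        long heightPx = (long) Math.max(Math.floor(heightPt * scale), 1);
        return widthPx * heightPx * Integer.BYTES;
    }

    /** Length of the int[] that backs a TYPE_INT_RGB image of the given size. */
    public static int backingArrayLength(int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        // The image owns a raster, and the raster's data buffer wraps the int[].
        DataBufferInt buffer = (DataBufferInt) image.getRaster().getDataBuffer();
        return buffer.getData().length;
    }

    public static void main(String[] args) {
        BufferedImage image = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        // Prints the concrete raster class the JDK chose for this image type.
        System.out.println(image.getRaster().getClass().getName());
        // One int per pixel.
        System.out.println(backingArrayLength(4, 3));
        // Estimated bytes for an A4-sized page at 300 DPI.
        System.out.println(estimateBytes(595, 842, 300));
    }
}
```

If these assumptions hold, the three entries per page are one image rather than three copies of the pixels: the BufferedImage references the raster, and the raster references the int[], which is where nearly all of the bytes are. Under the same assumptions an A4 page at 300 DPI comes to roughly 35 MB for the array alone.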
The code is very simple:
    try (final InputStream instream = Files.newInputStream(Paths.get(file));
         final PDDocument document = PDDocument.load(instream)) {
        final PDFRenderer pdfRenderer = new PDFRenderer(document);
        final int pageCount = document.getNumberOfPages();
        // One thread per page, so every page may be rendered at the same time.
        ExecutorService executor = Executors.newFixedThreadPool(pageCount);
        List<Future<Path>> futures = new ArrayList<>();
        for (int page = 0; page < pageCount; page++) {
            futures.add(executor.submit(new ConvertPdfPageToPngCallable(pdfRenderer, page, destination)));
        }
        // Collect the results in page order.
        for (Future<Path> future : futures) {
            try {
                retval.add(future.get());
            } catch (InterruptedException | ExecutionException e) {
                executor.shutdownNow();
                ...
            }
        }
    }
    private class ConvertPdfPageToPngCallable implements Callable<Path> {
        ...
        @Override
        public Path call() throws Exception {
            final BufferedImage bim = pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB);
            ImageIOUtil.writeImage(bim, destination, 300);
            return Paths.get(destination);
        }
    }
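One observation about the code above: the pool has one thread per page, so in principle every page image can be alive at the same moment. The sketch below is an illustration only. It replaces the real rendering with a placeholder (a short sleep standing in for renderImageWithDPI plus writeImage) and uses a hypothetical class BoundedRenderPool to show that a fixed pool size caps how many tasks, and therefore how many page images, exist concurrently. Whether this lowers memory in the real PDFBox setup is an assumption that would need to be measured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedRenderPool {

    /** Runs pageCount placeholder tasks on poolSize threads and returns the peak concurrency seen. */
    public static int peakConcurrentTasks(int pageCount, int poolSize) {
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int page = 0; page < pageCount; page++) {
                final int pageIndex = page;
                futures.add(executor.submit(() -> {
                    // Track how many tasks are inside the "render" section right now.
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(5); // placeholder for rendering and writing one page
                        return pageIndex;
                    } finally {
                        running.decrementAndGet();
                    }
                }));
            }
            for (Future<Integer> future : futures) {
                future.get(); // wait for every page to finish
            }
        } catch (InterruptedException | ExecutionException e) {
            executor.shutdownNow();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown(); // no effect if shutdownNow() already ran
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 20 pages on 2 threads: the reported peak cannot exceed the pool size.
        System.out.println(peakConcurrentTasks(20, 2));
    }
}
```

With poolSize equal to pageCount, as in my code, the peak can grow up to the number of pages. With a small fixed size, the remaining pages wait in the executor's queue as lightweight task objects rather than as rendered images.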