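For loop is very slow - Java

Time: 2018-12-06 12:48:00

Tags: java pdfbox

I basically want to load a PDF with Apache PDFBox and convert it into a list containing one Base64 string per page.

I tried the following code, but it is very slow. I don't need to convert the pages to images; I just want Base64 that I can pass to the front end.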

PDDocument document = PDDocument.loadNonSeq(new File("Random.pdf"), null);
@SuppressWarnings("unchecked")
List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
int page = 0;
List<String> base64DocumentPages = new ArrayList<>();
for (PDPage pdPage : pdPages)
{ 
    ++page;           
    BufferedImage img = pdPage.convertToImage(BufferedImage.TYPE_INT_RGB, 300); // this is slow
    ByteArrayOutputStream os = new ByteArrayOutputStream();
    ImageIOUtil.writeImage(img, ".png", os);
    String base64Page = Base64.getEncoder().encodeToString(os.toByteArray());
    base64DocumentPages.add(URLEncoder.encode(base64Page, "UTF-8"));
}
document.close();

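I am using PDFBox to loop through the pages, but I am open to using anything else if you know a better option.

PS: I really need the Base64 data for each page kept separate in some kind of array.

1 answer:

Answer 0: (score: 0)

Are you sure it is the convertToImage method? In our case, the writeImage method took the longest. The problem is that you are using the standard PNG writer. It has a flaw/bug: it always uses the best compression, at the expense of time. This was fixed in Java 9, but a backported version is available for earlier releases. So what do you need to do?

(1) Add the following dependency to your Maven project (or add the jar manually if you are not using Maven):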

<dependency>
    <groupId>net.gredler</groupId>
    <artifactId>jdk9-png-writer-backport</artifactId>
    <version>1.0.0</version>
</dependency>

(2) Make sure you are using PDFBox 2.x.

(3) Change your conversion code:

private static List<String> convertMethod2(File pdf) throws IOException {
        List<String> base64DocumentPages = new ArrayList<>();

        // try-with-resources closes the document, so no explicit close() is needed
        try (final PDDocument document = PDDocument.load(pdf)) {
            PDFRenderer pdfRenderer = new PDFRenderer(document);

            for (int page = 0; page < document.getNumberOfPages(); ++page) {
                BufferedImage bim = pdfRenderer.renderImageWithDPI(page, 150, ImageType.RGB);

                ByteArrayOutputStream baos = new ByteArrayOutputStream();
                PNGImageWriterBackport writer = chosePngWriter();

                if (writer == null) {
                    // without the backport there is nothing to encode, so stop here
                    throw new IOException("PNGImageWriterBackport not found! Aborting");
                }
                try (ImageOutputStream stream = new MemoryCacheImageOutputStream(baos)) {
                    writer.setOutput(stream);
                    writer.write(null, new IIOImage(bim, null, null), getImageParams(writer));
                }
                finally {
                    writer.dispose();
                }
                // encode the PNG bytes that were written to baos
                String base64Page = Base64.getEncoder().encodeToString(baos.toByteArray());
                base64DocumentPages.add(URLEncoder.encode(base64Page, "UTF-8"));
            }
        }
        return base64DocumentPages;
    }

    private static PNGImageWriterBackport chosePngWriter() {
        Iterator<ImageWriter> imageWriters = ImageIO.getImageWritersByFormatName("png");

        ImageWriter writer = null;
        while(imageWriters.hasNext()) {

            writer = imageWriters.next();
            if (writer instanceof PNGImageWriterBackport) {
                return (PNGImageWriterBackport)writer;
            }
        }
        return null;
    }

    private static ImageWriteParam getImageParams(PNGImageWriterBackport writer) {
        ImageWriteParam writeParam = writer.getDefaultWriteParam();
        //set compression mode which wasn't possible before
        writeParam.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
        //0.0f highest compression, slowest
        //1.0f lowest compression, fastest
        writeParam.setCompressionQuality(0.9f);

        return writeParam;
    }

(4) Of course, lowering the DPI, for example to 150, also speeds up the process. But I know that this is not always possible...
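To make the effect of step (4) concrete, here is a small arithmetic sketch of how the pixel count of a rendered page scales with DPI. The page size (US Letter, 612 x 792 points) and the class and method names are assumptions chosen for illustration; the only fixed fact is that PDF page dimensions are measured in points, 72 per inch.

```java
class DpiPixelCount {
    // PDF page dimensions are expressed in points; there are 72 points per inch.
    static final double POINTS_PER_INCH = 72.0;

    /** Approximate number of pixels for a page of the given size (in points) rendered at the given DPI. */
    static long pixelCount(double widthPt, double heightPt, int dpi) {
        long widthPx = Math.round(widthPt / POINTS_PER_INCH * dpi);
        long heightPx = Math.round(heightPt / POINTS_PER_INCH * dpi);
        return widthPx * heightPx;
    }

    public static void main(String[] args) {
        // Assumed page size for illustration: US Letter, 8.5 x 11 inches.
        long at300 = pixelCount(612, 792, 300);
        long at150 = pixelCount(612, 792, 150);
        System.out.println("300 DPI: " + at300 + " pixels");
        System.out.println("150 DPI: " + at150 + " pixels");
        System.out.println("ratio:   " + (double) at300 / at150);
    }
}
```

Halving the DPI halves both dimensions, so the renderer and the PNG writer each have to process roughly a quarter of the pixels.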

This is a lot faster, at the cost of no longer getting the smallest possible PNG file size...
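Since the answer notes that the problem is fixed in Java 9, the same compression setting should be available on the standard PNG writer there without the backport. The following is a minimal sketch under that assumption, using only JDK classes; the class name, the method name `toBase64Png`, and the blank test image are hypothetical and stand in for a page rendered by PDFBox.

```java
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import javax.imageio.stream.MemoryCacheImageOutputStream;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Base64;

class FastPngBase64 {

    /** Writes the image as PNG with the given compression quality and returns the bytes as Base64. */
    static String toBase64Png(BufferedImage image, float quality) throws IOException {
        // Take the first registered PNG writer (the JDK ships one).
        ImageWriter writer = ImageIO.getImageWritersByFormatName("png").next();
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ImageOutputStream stream = new MemoryCacheImageOutputStream(baos)) {
            ImageWriteParam param = writer.getDefaultWriteParam();
            // Older JDKs do not let the PNG writer's compression be changed;
            // in that case the default settings are used.
            if (param.canWriteCompressed()) {
                param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
                // 0.0f = highest compression (slowest), 1.0f = lowest compression (fastest)
                param.setCompressionQuality(quality);
            }
            writer.setOutput(stream);
            writer.write(null, new IIOImage(image, null, null), param);
        } finally {
            writer.dispose();
        }
        // The stream is closed (and flushed into baos) before the bytes are read.
        return Base64.getEncoder().encodeToString(baos.toByteArray());
    }

    public static void main(String[] args) throws IOException {
        // Placeholder image; in the real code this would be the rendered page.
        BufferedImage page = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        String base64 = toBase64Png(page, 0.9f);
        System.out.println(base64.length() + " Base64 characters");
    }
}
```

Calling this once per rendered page and collecting the results in a `List<String>` would give the per-page array the question asks for.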