Editing images in a PDF file using the COSStream object

Date: 2014-03-16 02:06:12

Tags: java pdf pdfbox javax.imageio

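I'm trying to edit images in a PDF file using the PDFBox library. So far it works only for JPEG images: ImageIO.read() cannot decode images that have the 'png' suffix. Here is the code sample. So my question is: how can I do the same for all types of images in a PDF document? Can I still use ImageIO, or do I need a different approach?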

public static void main(String[] args) throws Exception {

    PDDocument doc = PDDocument.load("docs/input1.pdf");

    // Get all images from first page 
    Map<String, PDXObjectImage> pageImages = ((PDPage) doc.getDocumentCatalog().getAllPages().get(0)).getResources().getImages();
    if (pageImages != null) 
    {
        // iterate by images
        Iterator<String> imageIter = pageImages.keySet().iterator();
        while (imageIter.hasNext()) 
        {
            String key =  imageIter.next();

            PDXObjectImage image = pageImages.get(key); // get page image object
            String suffix = image.getSuffix();  // get image suffix
            String imageName = key+'.'+suffix;  // compose image name

            System.out.print("process "+imageName+"... ");

            COSStream s = image.getCOSStream(); // get COSStream to manipulate
            BufferedImage img = ImageIO.read(s.getFilteredStream()); // get BufferedImage to edit

            if(img == null)
            {
                System.out.println("Can't decode");
            }
            else
            {
                paint(img.createGraphics()); // draw on it
                ImageIO.write(img, suffix, new File("out/"+imageName)); // write file to check result...

                // encode image back to COSStream
                OutputStream out = s.createFilteredStream();
                ImageIO.write(img, suffix, out);
                out.close();
                System.out.println("done");
            }
        }
    }
    doc.save("out/output1.pdf"); // save document
}   

/**
 * Draws a red rectangle for testing
 * @param g graphics
 */
public static void paint(Graphics2D g) {
    int xpoints[] = {25, 245, 245, 25};
    int ypoints[] = {25, 25, 545, 545};
    g.setColor(Color.RED);
    g.fillPolygon(xpoints, ypoints, 4);
}

1 answer:

Answer 0 (score: 1)

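It is better not to work with the PDXObjectImage stream directly, but to create a new PDXObjectImage instance and replace the old one in the resources collection. This is the more general approach. Use getRGBImage() to convert a PDXObjectImage to a BufferedImage, and the constructors (PDPixelMap, PDJpeg, etc.) to convert the edited result back to a PDXObjectImage. Note that, because of bugs, you will still run into problems with JBIG2 and JPEG 2000 images. Here is the code sample I use to find and convert all images in a document: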

// Recursive resource processor.
// Images can also be nested inside PDXObjectForm objects, hence the recursion.
protected static void processResources(PDResources resources, PDDocument doc, String filename) throws IllegalArgumentException, SecurityException, IOException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException, JBIG2Exception, ColorSpaceException, ICCProfileException
{
    if(resources == null) return;
    Map<String, PDXObject> xObjects = resources.getXObjects();
    if (xObjects == null) return;

    // iterate by images
    Iterator<String> imageIter = xObjects.keySet().iterator();
    while (imageIter.hasNext()) 
    {
        String key =  imageIter.next();

        PDXObject o = xObjects.get(key);

        if(o instanceof PDXObjectImage)
            xObjects.put(key, processImage((PDXObjectImage) o /*, some additional parms... */));

        if(o instanceof PDXObjectForm)
            processResources(((PDXObjectForm) o).getResources(), doc, filename);
    }

    resources.setXObjects(xObjects);
}

Note the call to resources.setXObjects() at the end: without it, the changes you make to the collection obtained from resources.getXObjects() will not be written back to the document.
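As background to why ImageIO.read() returned null in the question: assuming the 'png' images are FlateDecode streams (the usual case), the stream holds zlib-compressed raw pixel samples, not a PNG file, so ImageIO has no reader that recognizes it. The sketch below illustrates this with the standard library only. The class name RawSamplesDemo and its helper methods are hypothetical, and the decoding is deliberately simplified to 8-bit RGB with no predictor, which real PDF images do not guarantee.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;
import javax.imageio.ImageIO;

// Hypothetical demo class; not part of PDFBox.
public class RawSamplesDemo {

    /** Compresses bytes with zlib, similar to what a FlateDecode stream stores. */
    public static byte[] deflate(byte[] raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(bos)) {
            dos.write(raw);
        }
        return bos.toByteArray();
    }

    /** Reverses deflate(): returns the uncompressed bytes. */
    public static byte[] inflate(byte[] compressed) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                bos.write(buf, 0, n);
            }
        }
        return bos.toByteArray();
    }

    /** Builds an image from 8-bit interleaved RGB samples stored row by row. */
    public static BufferedImage fromRgbSamples(byte[] samples, int width, int height) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int i = (y * width + x) * 3;
                int rgb = ((samples[i] & 0xFF) << 16)
                        | ((samples[i + 1] & 0xFF) << 8)
                        | (samples[i + 2] & 0xFF);
                img.setRGB(x, y, rgb);
            }
        }
        return img;
    }

    public static void main(String[] args) throws IOException {
        ImageIO.setUseCache(false); // keep ImageIO in memory, no temp files

        // A 2x2 image: red, green, blue, white.
        byte[] samples = {
            (byte) 255, 0, 0,   0, (byte) 255, 0,
            0, 0, (byte) 255,   (byte) 255, (byte) 255, (byte) 255
        };
        byte[] stream = deflate(samples);

        // ImageIO finds no matching reader for zlib data and returns null.
        BufferedImage viaImageIo = ImageIO.read(new ByteArrayInputStream(stream));
        System.out.println("ImageIO decoded the stream: " + (viaImageIo != null));

        // Decoding by hand works once width, height and sample layout are known.
        BufferedImage manual = fromRgbSamples(inflate(stream), 2, 2);
        System.out.printf("pixel(0,0) = %06X%n", manual.getRGB(0, 0) & 0xFFFFFF);
    }
}
```

In practice the width, height, bit depth and color space come from the image dictionary, which is why letting PDFBox do the conversion via getRGBImage(), as the answer suggests, is simpler than decoding the stream by hand.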