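Is there a way to convert an XFA PDF into a set of images (PNG or JPEG) with Apache PDFBox?
The version I am using (1.8.6) is supposed to support XFA.
The PDF file to convert is a dynamic PDF form (XFA). Converting static PDF forms causes no problems.
With PDFBox 1.8.6 I got the conversion working by calling the PDPage.convertToImage() method.
When I try it with an XFA PDF, this is the image I get:
Here is the code I use to test the PDF conversion: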
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.List;
import javax.imageio.ImageIO;
// PDFBox's own RandomAccessFile (used as scratch file), not java.io.RandomAccessFile
import org.apache.pdfbox.io.RandomAccessFile;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

public void convertToImages(File sourceFile, File destinationDir) {
    if (!destinationDir.exists()) {
        destinationDir.mkdir();
        System.out.println("Folder Created -> " + destinationDir.getAbsolutePath());
    }
    if (sourceFile.exists()) {
        System.out.println("Images copied to Folder: " + destinationDir.getName());
        PDDocument document = null;
        try {
            // The classical way to load the PDF document doesn't work here
            // document = PDDocument.load(sourceFile);
            File scratch = new File(destinationDir, "scratch");
            if (scratch.exists()) {
                scratch.delete();
            }
            document = PDDocument.loadNonSeq(sourceFile, new RandomAccessFile(scratch, "rw"));
            // Doesn't seem to have an effect in my case but I keep it ;-)
            document.setAllSecurityToBeRemoved(true);
            @SuppressWarnings("unchecked")
            List<PDPage> list = document.getDocumentCatalog().getAllPages();
            System.out.println("Total files to be converted -> " + list.size());

            // Strip the extension, if there is one, to build the image base name
            String fileName = sourceFile.getName();
            int pos = fileName.lastIndexOf('.');
            if (pos > 0) {
                fileName = fileName.substring(0, pos);
            }

            int pageNumber = 1;
            for (PDPage page : list) {
                File outputfile = new File(destinationDir, fileName + "_" + pageNumber + ".png");
                try {
                    BufferedImage image = page.convertToImage();
                    ImageIO.write(image, "png", outputfile);
                    if (outputfile.exists()) {
                        System.out.println("Image Created -> " + outputfile.getName());
                    } else {
                        System.out.println("Image NOT Created -> " + outputfile.getName());
                    }
                } catch (Exception e) {
                    System.out.println("Error while creating image file " + outputfile.getName());
                    e.printStackTrace();
                }
                // Advance even after a failure so the next page gets its own file name
                pageNumber++;
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (document != null) {
                try {
                    document.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
        System.out.println("Converted Images are saved at -> " + destinationDir.getAbsolutePath());
    } else {
        System.err.println(sourceFile.getName() + " File does not exist");
    }
}
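Since the goal is PNG or JPEG, here is a minimal sketch of how the ImageIO.write step could be wrapped so that JPEG output also works and a missing writer is reported instead of silently ignored. It uses only the JDK; the class name PageImageWriter and its write method are hypothetical names chosen for illustration, not part of PDFBox. The assumption is that the page image may carry an alpha channel, which JPEG cannot store, so it is flattened onto a white background first.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of PDFBox.
public class PageImageWriter {

    /** Writes the image in the given format ("png", "jpg" or "jpeg"). */
    public static void write(BufferedImage image, String format, File target) throws IOException {
        BufferedImage toWrite = image;
        if (format.equalsIgnoreCase("jpg") || format.equalsIgnoreCase("jpeg")) {
            // JPEG has no alpha channel: copy the image onto an opaque white RGB canvas.
            toWrite = new BufferedImage(image.getWidth(), image.getHeight(),
                    BufferedImage.TYPE_INT_RGB);
            Graphics2D g = toWrite.createGraphics();
            try {
                g.setColor(Color.WHITE);
                g.fillRect(0, 0, image.getWidth(), image.getHeight());
                g.drawImage(image, 0, 0, null);
            } finally {
                g.dispose();
            }
        }
        // ImageIO.write returns false when no registered writer accepts the image.
        if (!ImageIO.write(toWrite, format, target)) {
            throw new IOException("No ImageIO writer found for format " + format);
        }
    }
}
```

In the loop above, the call ImageIO.write(image, "png", outputfile) could then be replaced by PageImageWriter.write(image, "jpg", outputfile), with the file extension adjusted to match. This only changes the output format; it does not affect how the XFA form itself is rendered.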
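Is there anything special that needs to be done in this case?
I tried GIMP, even though this is supposed to run on a server, but it does not work for dynamic PDFs either.
I also tried ImageMagick, but it did not work at all. Since I would have been surprised if it could solve anything, I gave up on investigating it further.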