我试图将.doc
转换为.pdf
,但我得到了例外,而我不知道如何修复它。
java.io.IOException: Missing root object specification in trailer
at org.apache.pdfbox.pdfparser.COSParser.parseTrailerValuesDynamically(COSParser.java:2042)
这是抛出异常的地方:
PDDocument pdfDocument = PDDocument.load(convertDocToPdf(documentInputStream));
这是我的转换方法:
private byte[] convertDocToPdf(InputStream documentInputStream) throws Exception {
Document document = null;
WordExtractor we = null;
ByteArrayOutputStream out = null;
byte[] documentByteArray = null;
try {
document = new Document();
POIFSFileSystem fs = new POIFSFileSystem(documentInputStream);
HWPFDocument doc = new HWPFDocument(fs);
we = new WordExtractor(doc);
out = new ByteArrayOutputStream();
PdfWriter writer = PdfWriter.getInstance(document, out);
Range range = doc.getRange();
document.open();
writer.setPageEmpty(true);
document.newPage();
writer.setPageEmpty(true);
String[] paragraphs = we.getParagraphText();
for (int i = 0; i < paragraphs.length; i++) {
org.apache.poi.hwpf.usermodel.Paragraph pr = range.getParagraph(i);
paragraphs[i] = paragraphs[i].replaceAll("\\cM?\r?\n", "");
document.add(new Paragraph(paragraphs[i]));
}
documentByteArray = out.toByteArray();
} catch (Exception ex) {
ex.printStackTrace(System.out);
throw new Exception(STATE.FAILED_CONVERSION.name());
} finally {
document.close();
try {
we.close();
out.close();
} catch (IOException e) {
e.printStackTrace();
}
}
return documentByteArray;
}
答案 0 :(得分:0)
您使用iText类并执行
documentByteArray = out.toByteArray();
在完成文档之前
document.close();
因此,documentByteArray
仅包含PDFBox抱怨的不完整PDF。