I tried the following code to read a PDF:
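import java.awt.image.BufferedImage
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, File, FileInputStream}
import javax.imageio.ImageIO
// PDFDocumentReader and PageDetail come from the dopdf library (org.dopdf in the stack trace below).

val byteArrayOutPutStream: ByteArrayOutputStream = new ByteArrayOutputStream // not used in this snippet
val file = new File(path + name)
val inputStream = new FileInputStream(file)
val document = new PDFDocumentReader(inputStream)
var result: List[BufferedImage] = Nil
val numPgs = document.getNumberOfPages
for (i <- 0 until numPgs) {
  // Render page i to image bytes, then decode the bytes into a BufferedImage.
  val pageDetail = new PageDetail("", "", i, "")
  val resourceDetails = document.getPageAsImage(pageDetail)
  val image = ImageIO.read(new ByteArrayInputStream(resourceDetails.getBytes()))
  result ::= image // prepends, so the list ends up in reverse page order
}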
But with one particular PDF, I get the following error:
Oct 24, 2013 10:48:01 AM org.apache.pdfbox.pdmodel.font.PDTrueTypeFont getawtFont
INFO: Can't read the embedded font ESNOYH+Calibri-Bold
Oct 24, 2013 10:48:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.lang.NullPointerException
java.lang.NullPointerException
at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.getawtFont(PDTrueTypeFont.java:427)
at org.apache.pdfbox.pdmodel.font.PDSimpleFont.drawString(PDSimpleFont.java:97)
at org.apache.pdfbox.pdfviewer.PageDrawer.processTextPosition(PageDrawer.java:190)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:494)
at org.apache.pdfbox.util.operator.ShowTextGlyph.process(ShowTextGlyph.java:62)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:551)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:274)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:225)
at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:107)
at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:722)
at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:693)
at org.dopdf.document.read.pdf.PDFPage.asImage(PDFPage.java:59)
How can I solve this problem?
Answer 0 (score: 0)
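This looks like the known issue PDFBOX-490: Pdf Printing of text from embedded fonts, which has apparently been fixed. However, I believe the fix version, 2.0.0, has not been released yet.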
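Once a PDFBox 2.x release is available to you, rendering could be done against PDFBox directly rather than through the wrapper used in the question. The following is only a minimal sketch under that assumption: it uses the PDFBox 2.x calls `PDDocument.load` and `PDFRenderer.renderImageWithDPI`, while the object name `RenderPdfPages`, the method `renderPages`, and the default of 150 DPI are hypothetical names and values chosen for illustration. Whether it renders the Calibri-Bold font in your particular file correctly is not something I can confirm.

```scala
import java.awt.image.BufferedImage
import java.io.File

import org.apache.pdfbox.pdmodel.PDDocument
import org.apache.pdfbox.rendering.PDFRenderer

// Hypothetical helper, assuming PDFBox 2.x is on the classpath.
object RenderPdfPages {

  // Renders every page of the PDF at `pdfPath` to a BufferedImage, in page order.
  def renderPages(pdfPath: String, dpi: Float = 150f): List[BufferedImage] = {
    // PDDocument.load is the PDFBox 2.x entry point (3.x uses Loader.loadPDF instead).
    val document = PDDocument.load(new File(pdfPath))
    try {
      val renderer = new PDFRenderer(document)
      // Page indices are zero-based; map keeps the images in page order.
      (0 until document.getNumberOfPages)
        .map(i => renderer.renderImageWithDPI(i, dpi))
        .toList
    } finally {
      document.close() // always release the underlying file handle
    }
  }
}
```

A call would look like `RenderPdfPages.renderPages(path + name)`. Unlike the `result ::= image` loop in the question, this returns the pages in their original order.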