I am using pdfbox-2.0.9 in a Java application to convert a PDF file to HTML, but I get the following exception:
java.lang.UnsupportedOperationException
at org.apache.pdfbox.pdmodel.graphics.color.PDPattern.toRGB(PDPattern.java:95)
at org.fit.pdfdom.PathDrawer.pdfColorToColor(PathDrawer.java:133)
at org.fit.pdfdom.PathDrawer.clearPathGraphics(PathDrawer.java:79)
at org.fit.pdfdom.PathDrawer.drawPath(PathDrawer.java:59)
at org.fit.pdfdom.PDFDomTree.createPathImage(PDFDomTree.java:403)
at org.fit.pdfdom.PDFDomTree.renderPath(PDFDomTree.java:251)
at org.fit.pdfdom.PDFBoxTree.processOperator(PDFBoxTree.java:499)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:503)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:477)
at org.apache.pdfbox.contentstream.PDFStreamEngine.showForm(PDFStreamEngine.java:181)
at org.apache.pdfbox.contentstream.operator.DrawObject.process(DrawObject.java:65)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:848)
at org.fit.pdfdom.PDFBoxTree.processOperator(PDFBoxTree.java:542)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:503)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:477)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:150)
at org.apache.pdfbox.text.LegacyPDFStreamEngine.processPage(LegacyPDFStreamEngine.java:139)
at org.apache.pdfbox.text.PDFTextStripper.processPage(PDFTextStripper.java:391)
at org.fit.pdfdom.PDFBoxTree.processPage(PDFBoxTree.java:208)
at org.apache.pdfbox.text.PDFTextStripper.processPages(PDFTextStripper.java:319)
at org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:266)
at org.fit.pdfdom.PDFDomTree.createDOM(PDFDomTree.java:218)
at com.demo.pdf.converter.PdfProcessor.convertToHtml(PdfProcessor.java:87)
The PDF I am trying to convert can be accessed here.
Answer 0 (score: 2)
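This issue is fixed in PDF2Dom v1.9. I tried the PDF file you provided with that version, and it was converted correctly, with no exception.
Please confirm by updating PDF2Dom to v1.9.
You can find the latest dependency here.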
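If the project is built with Maven, the upgrade might look roughly like the fragment below. This is only a sketch: the coordinates `net.sf.cssbox:pdf2dom` are an assumption that does not come from the original post, so verify them (and the exact version string) against the dependency page before using them.

```xml
<!-- Assumed coordinates for PDF2Dom; check them against the dependency page -->
<dependency>
    <groupId>net.sf.cssbox</groupId>
    <artifactId>pdf2dom</artifactId>
    <!-- Version reported above as containing the fix -->
    <version>1.9</version>
</dependency>
```

After updating, rebuild and rerun the conversion on the same PDF to check that the `UnsupportedOperationException` from `PDPattern.toRGB` no longer appears.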