I am using PDFBox version 1.8.3 to extract fonts from PDFs. For some PDFs, I occasionally get an IndexOutOfBoundsException. Here is the stack trace:
java.lang.IndexOutOfBoundsException: Index: 2218, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:653)
at java.util.ArrayList.get(ArrayList.java:429)
at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84)
at org.apache.pdfbox.io.RandomAccessFileInputStream.read(RandomAccessFileInputStream.java:96)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at java.io.PushbackInputStream.read(PushbackInputStream.java:139)
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:386)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:118)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:602)
at org.apache.pdfbox.pdmodel.font.PDSimpleFont.extractToUnicodeEncoding(PDSimpleFont.java:458)
at org.apache.pdfbox.pdmodel.font.PDSimpleFont.determineEncoding(PDSimpleFont.java:426)
at org.apache.pdfbox.pdmodel.font.PDType1Font.determineEncoding(PDType1Font.java:269)
at org.apache.pdfbox.pdmodel.font.PDFont.<init>(PDFont.java:193)
at org.apache.pdfbox.pdmodel.font.PDSimpleFont.<init>(PDSimpleFont.java:88)
at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:152)
at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:92)
at org.apache.pdfbox.pdmodel.PDResources.getFonts(PDResources.java:204)
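This is not a problem with the PDF file itself. The exception is intermittent: for the same PDF, sometimes I get the exception and sometimes I don't.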
Any help is greatly appreciated.