
I am currently parsing a PDF with the PDFBox library, as follows:

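import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;      // PDFBox 2.x package names
import org.apache.pdfbox.text.PDFTextStripper;

File f = new File("myFile.pdf");

// try-with-resources closes the document even if text extraction throws
try (PDDocument doc = PDDocument.load(f)) {
    // PDFLayoutTextStripper (a third-party class) preserves the page layout, which prints better in the console
    PDFTextStripper pdfStripper = new PDFLayoutTextStripper();
    String text = pdfStripper.getText(doc);
    System.out.println(text);
}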

The extracted text comes out nicely formatted. My PDF files will have the following structure:

My super PDF file, this is the first one

someKey1 someValue1

someKey2 someValue2

someKey3 someValue3

...

someKey1 someValue4

someKey2 someValue5

someKey3 someValue6

...

Here are some headings

This will be my next pair

someKey4 someValue7

...

Is there a library that can give me all the values for a given key, for example someKey1? Or is there a better solution for parsing PDFs in Java?
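For reference, here is a minimal sketch of one possible approach that needs no extra library: post-process the extracted text line by line. It assumes each pair sits on its own line, that the key is the first whitespace-delimited token, and that the rest of the line is the value. The class `KeyValueExtractor` and the method `valuesFor` are hypothetical names introduced for illustration; they are not part of PDFBox.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox: works on the plain text
// returned by a text stripper.
class KeyValueExtractor {

    // Returns every value whose line starts with the given key.
    // Assumes one "key value" pair per line, separated by whitespace.
    static List<String> valuesFor(String text, String key) {
        List<String> values = new ArrayList<>();
        // \R matches any line break sequence (Java 8+)
        for (String line : text.split("\\R")) {
            // trim() drops the leading padding a layout-preserving stripper may add;
            // the limit of 2 keeps values that contain spaces in one piece
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length == 2 && parts[0].equals(key)) {
                values.add(parts[1]);
            }
        }
        return values;
    }
}
```

Calling `KeyValueExtractor.valuesFor(text, "someKey1")` on the extracted text would collect the values on every line that begins with `someKey1`. Headings and title lines are skipped only because their first word does not match the requested key, so this will likely need adjusting if keys can contain spaces or if a pair can wrap across lines.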

Parsing a PDF file with key-value pairs using PDFBox

0 answers: