
I use the following code to get the full text content of any PDF file with PDFBox:

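    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.util.PDFTextStripper;   // PDFBox 1.x package

    private static void textExtraction() throws IOException {
        String encoding = null;       // null: let the stripper use its default encoding
        String inputFile = "path";    // placeholder (assumed): the PDF to read
        String outputFile = "path";   // placeholder: the text file to write

        // The original snippet did not show where `document` comes from; loading it here is an assumption.
        PDDocument document = PDDocument.load(inputFile);
        Writer output = new OutputStreamWriter(new FileOutputStream(outputFile));
        try {
            PDFTextStripper stripper = new PDFTextStripper(encoding);
            stripper.writeText(document, output);   // writes the text of every page
        } finally {
            try { output.close(); } finally { document.close(); }
        }
    }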

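This code works fine. The question is how to extract text and also know where it came from. For example, I would like to extract the text page by page and write each page to a separate file. Or I would like to search for a keyword, extract only the parts of the text where it occurs, and be told where each occurrence is (for instance, on which page).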
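
To make the keyword part of the question concrete, here is a minimal sketch in plain Java of the bookkeeping it would need. It assumes the text of each page is already available as one string per page, for example from one `PDFTextStripper.getText(document)` call per page after `setStartPage(p)` and `setEndPage(p)`. The names `KeywordHit` and `KeywordLocator` and the `radius` parameter are hypothetical and not part of PDFBox. The reported offset is a character index into the extracted page text, not a coordinate on the page, and the match is case-sensitive.

```java
import java.util.ArrayList;
import java.util.List;

/** One keyword occurrence (hypothetical helper type, not a PDFBox class). */
final class KeywordHit {
    final int page;        // 1-based page number
    final int offset;      // character index of the match within that page's text
    final String context;  // the match plus up to `radius` characters on each side

    KeywordHit(int page, int offset, String context) {
        this.page = page;
        this.offset = offset;
        this.context = context;
    }
}

final class KeywordLocator {
    /**
     * Scans per-page text for a keyword (case-sensitive, non-overlapping matches).
     * pageTexts.get(0) is assumed to hold the text of page 1, and so on.
     */
    static List<KeywordHit> find(List<String> pageTexts, String keyword, int radius) {
        List<KeywordHit> hits = new ArrayList<>();
        if (keyword.isEmpty()) {
            return hits;  // an empty keyword would match everywhere; treat it as no match
        }
        for (int i = 0; i < pageTexts.size(); i++) {
            String text = pageTexts.get(i);
            int from = 0;
            int at;
            while ((at = text.indexOf(keyword, from)) >= 0) {
                int start = Math.max(0, at - radius);
                int end = Math.min(text.length(), at + keyword.length() + radius);
                hits.add(new KeywordHit(i + 1, at, text.substring(start, end)));
                from = at + keyword.length();  // continue searching after this match
            }
        }
        return hits;
    }
}
```

Each per-page string could equally be written to its own file, which would cover the page-by-page case. Positions on the page itself would need something beyond this sketch, since plain strings carry no layout information.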

Extracting text from a PDF file with PDFBox

0 answers: