Question

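I have some code that loops over all the PDFs in a folder and parses each one to text. My problem now is how to save each new text file as its own document.

Here is my parser code: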

PDFTextStripper pdfStripper = null;
PDDocument pdDoc = null;
COSDocument cosDoc = null;
String parsedText = ""; // append the text to this every time
File folder = new File("/yourFolder"); // put all the pdf files in a folder
File[] listOfFiles = folder.listFiles(); // get all the files as an array

for (File file : listOfFiles) { // cycle through this array
    if (file.isFile()) { // for every file
        try { // do the same
            PDFParser parser = new PDFParser(new FileInputStream(file));
            parser.parse();
            cosDoc = parser.getDocument();
            pdfStripper = new PDFTextStripper();
            pdDoc = new PDDocument(cosDoc);
            pdfStripper.setStartPage(1);
            pdfStripper.setEndPage(pdDoc.getNumberOfPages()); // if always till the last page
            parsedText += pdfStripper.getText(pdDoc) + System.lineSeparator(); // append the text to the String
            pdDoc.close(); // release the document once its text has been read
        } catch (IOException e) { // the original snippet was cut off here; this handler is a placeholder
            e.printStackTrace();
        }
    }
}

I'm not sure how to save the new text files separately. I was planning to use this to write the parsed text to a text file inside the for loop:

try (FileWriter newfile = new FileWriter("C:/Placeholder.txt")) {
    newfile.write(parsedText);
    newfile.flush();
}

But this saves all of the files into one text file instead of saving each as its own file.

I feel like the answer is simple, but I'm stuck.

Any help?
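
One possible direction, sketched under assumptions rather than offered as a definitive fix: the snippets above combine everything because parsedText is declared outside the loop and extended with +=, and because the FileWriter always targets the same path. Keeping the text in a variable local to each iteration and deriving the output name from the input file's name would give one text file per PDF. In the sketch below, PerFileTextWriter, TextExtractor, toTextFileName, and convertAll are hypothetical names introduced for illustration; the extractor stands in for the PDFBox parsing step so that the example runs without that library.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class PerFileTextWriter {

    /** Hypothetical stand-in for the PDF-to-text step shown in the question. */
    interface TextExtractor {
        String extract(File pdf) throws IOException;
    }

    /** Maps "report.pdf" to "report.txt"; any other name just gets ".txt" appended. */
    static String toTextFileName(String pdfName) {
        if (pdfName.toLowerCase(Locale.ROOT).endsWith(".pdf")) {
            return pdfName.substring(0, pdfName.length() - 4) + ".txt";
        }
        return pdfName + ".txt";
    }

    /** Writes one text file per regular file in inputDir and returns the paths written. */
    static List<Path> convertAll(File inputDir, Path outputDir, TextExtractor extractor)
            throws IOException {
        Files.createDirectories(outputDir);
        List<Path> written = new ArrayList<>();
        File[] files = inputDir.listFiles();
        if (files == null) {
            return written; // inputDir is not a directory, or it could not be read
        }
        for (File file : files) {
            if (!file.isFile()) {
                continue; // skip subdirectories
            }
            // Local to this iteration, so text from earlier files never accumulates.
            String text = extractor.extract(file);
            // A different target for every input file instead of one fixed path.
            Path target = outputDir.resolve(toTextFileName(file.getName()));
            Files.write(target, text.getBytes(StandardCharsets.UTF_8));
            written.add(target);
        }
        return written;
    }
}
```

In the loop from the question, the extractor would wrap the existing PDFParser and PDFTextStripper calls and return the result of pdfStripper.getText(pdDoc) directly, rather than appending it to parsedText.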

How to save multiple parsed files separately

0 answers: