Question

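I want to display the differences between two .doc files line by line. I have already done this with .txt files, and it works perfectly. For that I used the following code: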

        FileReader File1Reader = new FileReader(File1.getPath());
        FileReader File2Reader = new FileReader(File2.getPath());

        // Create Buffered Object.
        BufferedReader File1BufRdr = new BufferedReader(File1Reader);
        BufferedReader File2BufRdr = new BufferedReader(File2Reader);

        // Get the file contents into String Variables.
        String File1Content = File1BufRdr.readLine();
        String File2Content = File2BufRdr.readLine();

        //New String Builder
        StringBuilder buffer = new StringBuilder();

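Is there a way to read a .doc file line by line? I am using the following code to read from a .doc file, but it does not read line by line. Here is the code: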

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;

public class read_From_Doc_Docx {
    public static void main(String[] args) {

        //Alternate between the two to check what works.
        //String FilePath = "D:\\Users\\username\\Desktop\\Doc1.docx";
        String FilePath = "/Users/esna786/Removal of Redundancy.docx";
        FileInputStream fis;

        if (FilePath.toLowerCase().endsWith(".docx")) { //is a docx
            try {
                fis = new FileInputStream(new File(FilePath).getAbsolutePath());
                XWPFDocument doc = new XWPFDocument(fis);
                XWPFWordExtractor extract = new XWPFWordExtractor(doc);
                System.out.println(extract.getText());
            } catch (IOException e) {

                e.printStackTrace();
            }
        } else { //is not a docx
            try {
                fis = new FileInputStream(new File(FilePath));
                HWPFDocument doc = new HWPFDocument(fis);
                WordExtractor extractor = new WordExtractor(doc);
                System.out.println(extractor.getText());
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

Answer 1

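Just use the getParagraphText() method instead of getText(). On WordExtractor it returns a String[] with one element per paragraph, so you can loop over the paragraphs instead of getting the whole text as one string.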
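
Once each document is available as an array of paragraphs, the comparison itself needs nothing from POI. The following is only a minimal sketch under stated assumptions: ParagraphDiff and its diff method are hypothetical names, not part of Apache POI, and the sample arrays stand in for the output of getParagraphText(). It compares paragraphs position by position, which is a naive approach and not a true LCS-style diff. Paragraph strings may carry trailing line terminators, so each one is trimmed before comparing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Apache POI: compares two documents
// paragraph by paragraph and reports the positions that differ.
class ParagraphDiff {

    // Each array is assumed to hold one paragraph per element, for example
    // the String[] returned by WordExtractor.getParagraphText() for a .doc file.
    static List<String> diff(String[] first, String[] second) {
        List<String> differences = new ArrayList<>();
        int max = Math.max(first.length, second.length);
        for (int i = 0; i < max; i++) {
            // A missing paragraph (one document is shorter) is treated as empty.
            String a = i < first.length ? first[i].trim() : "";
            String b = i < second.length ? second[i].trim() : "";
            if (!a.equals(b)) {
                differences.add("Paragraph " + (i + 1) + ": \"" + a + "\" vs \"" + b + "\"");
            }
        }
        return differences;
    }

    public static void main(String[] args) {
        // Sample data standing in for the output of getParagraphText().
        String[] doc1 = {"Title\r\n", "Same text\r\n", "Old ending\r\n"};
        String[] doc2 = {"Title\r\n", "Same text\r\n", "New ending\r\n", "Extra\r\n"};
        for (String line : diff(doc1, doc2)) {
            System.out.println(line);
        }
    }
}
```

In the asker's program, the two arrays would come from something like new WordExtractor(new HWPFDocument(fis)).getParagraphText() for each file. If paragraphs can be inserted or removed in the middle of a document, a positional comparison like this reports every later paragraph as changed, so a proper diff algorithm would be the better choice there.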
