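How to find and replace text in Word doc and docx files

Time: 2016-12-01 13:52:02

Tags: java

I want to find and replace text in doc and docx files using Java.

What I have tried: I tried reading these files as text files, but that did not work.

I don't know how to proceed or what else to try. Can anyone give me some guidance?

5 answers: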

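Answer 0: (score: 4)

These document formats are complex objects that you almost certainly do not want to parse yourself. I strongly recommend looking at the Apache POI libraries: they can load and save both the doc and docx formats, and they provide methods for accessing and modifying the file contents.

They are well documented, open source, actively maintained, and freely available.

In summary, use these libraries to: a) load the file, b) programmatically traverse its contents and modify them as needed (i.e. perform the search and replace), and c) save it back to disk.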
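
To illustrate why reading these files as plain text does not work: a .docx file is a ZIP archive whose main body is stored as XML in the entry `word/document.xml` (the older .doc format is a different, binary container). The sketch below uses only `java.util.zip` to list the entries of such an archive. The class name `DocxPeek` and the helper `fakeDocx` are hypothetical, made up for this illustration; `fakeDocx` builds a tiny stand-in archive, not a valid Word document.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class DocxPeek {

    // Lists the entry names of a ZIP-based package such as a .docx file.
    static List<String> listEntries(InputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        }
        return names;
    }

    // Hypothetical helper: builds an in-memory ZIP that mimics the layout
    // of a .docx (illustration only, Word would not open it).
    static byte[] fakeDocx() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("[Content_Types].xml"));
            zip.write("<Types/>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write("<w:document/>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }
}
```

Passing a `FileInputStream` for a real .docx file to `listEntries` would list the parts of that document. For actual editing, a library such as Apache POI remains the practical choice.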

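Answer 1: (score: 3)

I hope this solves your problem, my friend. I wrote it for docx; it uses Apache POI to search and replace. I recommend reading the full Apache POI documentation for more information.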

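    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.util.List;

    import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
    import org.apache.poi.openxml4j.opc.OPCPackage;
    import org.apache.poi.xwpf.usermodel.XWPFDocument;
    import org.apache.poi.xwpf.usermodel.XWPFParagraph;
    import org.apache.poi.xwpf.usermodel.XWPFRun;
    import org.apache.poi.xwpf.usermodel.XWPFTable;
    import org.apache.poi.xwpf.usermodel.XWPFTableCell;
    import org.apache.poi.xwpf.usermodel.XWPFTableRow;

    public class Find_Replace_DOCX {

        public static void main(String[] args) throws IOException, InvalidFormatException {
            // For an uploaded .doc file use HWPF instead; XWPFDocument is for .docx files.
            OPCPackage pkg = OPCPackage.open("d:\\1\\rpt.docx");
            try {
                XWPFDocument doc = new XWPFDocument(pkg);

                // Replace the placeholder in the body paragraphs.
                for (XWPFParagraph p : doc.getParagraphs()) {
                    List<XWPFRun> runs = p.getRuns();
                    if (runs != null) {
                        for (XWPFRun r : runs) {
                            String text = r.getText(0);
                            if (text != null && text.contains("$$key$$")) {
                                text = text.replace("$$key$$", "ABCD"); // your content
                                r.setText(text, 0);
                            }
                        }
                    }
                }

                // Replace the placeholder in the paragraphs of every table cell.
                for (XWPFTable tbl : doc.getTables()) {
                    for (XWPFTableRow row : tbl.getRows()) {
                        for (XWPFTableCell cell : row.getTableCells()) {
                            for (XWPFParagraph p : cell.getParagraphs()) {
                                for (XWPFRun r : p.getRuns()) {
                                    String text = r.getText(0);
                                    if (text != null && text.contains("$$key$$")) {
                                        text = text.replace("$$key$$", "abcd");
                                        r.setText(text, 0);
                                    }
                                }
                            }
                        }
                    }
                }

                // Write the modified document to a new file and close the stream.
                try (FileOutputStream out = new FileOutputStream("d:\\1\\output.docx")) {
                    doc.write(out);
                }
            } finally {
                // Close the package without saving changes back to rpt.docx.
                pkg.revert();
            }
        }
    }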
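
One limitation of the per-run approach above is worth knowing: Word may store a placeholder such as `$$key$$` split across several runs, in which case no single run contains it and nothing is replaced. Below is a minimal sketch of one possible workaround, modelled on plain strings so that it runs without POI: join the run texts of a paragraph, replace in the joined string, put the result into the first run, and empty the others. `RunMerge` and `replaceAcrossRuns` are hypothetical names, and this simplification gives all of the paragraph's text the formatting of the first run.

```java
import java.util.ArrayList;
import java.util.List;

final class RunMerge {

    // Replaces `key` in the concatenation of the run texts. If the key occurs,
    // the whole replaced text goes into the first slot and the remaining slots
    // become empty strings; otherwise an unchanged copy is returned.
    // Assumes the list contains no null elements.
    static List<String> replaceAcrossRuns(List<String> runTexts, String key, String value) {
        String joined = String.join("", runTexts);
        if (runTexts.isEmpty() || !joined.contains(key)) {
            return new ArrayList<>(runTexts);
        }
        List<String> result = new ArrayList<>();
        result.add(joined.replace(key, value));
        for (int i = 1; i < runTexts.size(); i++) {
            result.add("");
        }
        return result;
    }
}
```

With POI, the input list would presumably be built from `r.getText(0)` for each run of a paragraph (treating null as an empty string), and each result written back with `r.setText(..., 0)`.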

Answer 2: (score: 1)

If you want to use Docx4J as the library for parsing .docx files, I created a util library that does search and replace: https://github.com/phip1611/docx4j-search-and-replace-util

WordprocessingMLPackage template = WordprocessingMLPackage.load(new FileInputStream(new File("document.docx")));

Docx4JSRUtil.searchAndReplace(template, Map.of(
    "${NAME}", "Philipp",
    "${SURNAME}", "Schuster",
    "${PLACE_OF_BIRTH}", "GERMANY"
));
// that's it; you can now save `template`, export it as PDF or whatever you want to do

Answer 3: (score: 1)

For Android Kotlin users.

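import java.io.FileOutputStream
import java.io.IOException
import org.apache.poi.openxml4j.opc.OPCPackage
import org.apache.poi.xwpf.usermodel.XWPFDocument

private fun modifyDocFile(
    toReplace: String,
    newText: String,
    fileName: String,
    output: String
) {
    try {
        val document = XWPFDocument(OPCPackage.open(fileName))

        // Runs in the body paragraphs.
        val bodyRuns = document.paragraphs.flatMap { it.runs }

        // Runs in the paragraphs of every table cell.
        val tableRuns = document.tables.flatMap { table ->
            table.rows.filterNotNull()
                .flatMap { row -> row.tableCells }
                .flatMap { cell -> cell.paragraphs }
                .flatMap { paragraph -> paragraph.runs }
        }

        (bodyRuns + tableRuns).forEach { r ->
            // getText(0) returns null for a run without text, so check before using it.
            val text = r.getText(0)
            if (text != null && text.contains(toReplace)) {
                r.setText(text.replace(toReplace, newText), 0)
            }
        }

        // use {} closes the stream even if write() throws.
        FileOutputStream(output).use { document.write(it) }
    } catch (e: IOException) {
        e.printStackTrace()
    }
}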

Answer 4: (score: 0)

I am using code like this, and it seems to work very well. Thanks.

import java.io.FileOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

public class TesteMain {

    public static void main(String[] args) {
        Map<String, String> change = new HashMap<>();
        change.put("nomeContratante", "Jo Luis Pinto"); // word to be replaced -> replacement
        change.put("customer1", "Maikon");
        String pathOriginal = "C:\\testeDocx\\"; // template directory
        String templateDoc = "doc1.docx"; // the original document will not be changed
        changeDocx(change, pathOriginal, templateDoc);
    }

    private static void changeDocx(Map<String, String> change, String pathOriginal, String templateDoc) {
        try {
            // Copy the template to the temp folder (taken from the TEMP environment
            // variable) so that the original file is not modified.
            String tempPath = System.getenv("TEMP") + "\\temp.docx";
            Path dirOrigem = Paths.get(pathOriginal + templateDoc);
            Path dirDestino = Paths.get(tempPath);
            Files.copy(dirOrigem, dirDestino, StandardCopyOption.REPLACE_EXISTING);

            try (XWPFDocument doc = new XWPFDocument(OPCPackage.open(tempPath))) {
                for (XWPFParagraph p : doc.getParagraphs()) {
                    List<XWPFRun> runs = p.getRuns();
                    if (runs != null) {
                        for (XWPFRun r : runs) {
                            String text = r.getText(0);
                            // Apply every entry of the map to the text of this run.
                            for (Map.Entry<String, String> entry : change.entrySet()) {
                                if (text != null && text.contains(entry.getKey())) {
                                    text = text.replace(entry.getKey(), entry.getValue());
                                    r.setText(text, 0);
                                }
                            }
                        }
                    }
                }

                // Tables are not processed here. The same replacement could be applied to them:
                // for (XWPFTable tbl : doc.getTables())
                //     for (XWPFTableRow row : tbl.getRows())
                //         for (XWPFTableCell cell : row.getTableCells())
                //             for (XWPFParagraph p : cell.getParagraphs())
                //                 for (XWPFRun r : p.getRuns()) { /* same replacement as above */ }

                // Save a new file with a modified name in the original directory.
                try (FileOutputStream out = new FileOutputStream(pathOriginal + "changed_" + templateDoc)) {
                    doc.write(out);
                }
            }
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}
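
The code above builds the temporary path from the `TEMP` environment variable, which is typically set on Windows only. As a hedged alternative, the JDK's `Files.createTempFile` picks the platform's default temporary directory. `TempCopy` and `copyToTemp` are hypothetical names for this sketch, not part of the answer's code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class TempCopy {

    // Copies `template` to a freshly created temporary .docx file
    // and returns the path of the copy. The template itself is untouched.
    static Path copyToTemp(Path template) throws IOException {
        Path temp = Files.createTempFile("template-", ".docx");
        // createTempFile already created an empty file, so it must be replaced.
        Files.copy(template, temp, StandardCopyOption.REPLACE_EXISTING);
        return temp;
    }
}
```

The returned path could then be opened in place of `tempPath`, for example with `OPCPackage.open(temp.toString())`.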