Apache POI: search from a .txt and replace in a .docm

Time: 2017-03-28 15:04:42

Tags: java regex replace apache-poi docx

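I am using the Apache POI library to search and replace inside a .docm file. I have a .txt file that contains the first 3 words of every page, extracted from a PDF file (which has the same content as the .docm file).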

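I want to take each line from the .txt and search for it inside the .docm. On a match, I will add an HTML-style page tag such as <page>(pdfpagenumber)</page> right before those 3 words.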
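The tagging step described above can be sketched on a plain string, without Apache POI. This is a minimal illustration, not the asker's method: the class `PageTagSketch` and the method `tagPages` are hypothetical names, and it assumes that line i of the .txt belongs to PDF page i + 1 and that the pages appear in order in the text.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of the question's code or of Apache POI.
final class PageTagSketch {

    // Inserts <page>N</page> before the first occurrence of each phrase.
    // Assumption: phrase i in the list belongs to PDF page i + 1.
    static String tagPages(String text, List<String> phrases) {
        StringBuilder out = new StringBuilder(text);
        int from = 0; // search forward only, because pages are assumed to be in order
        for (int i = 0; i < phrases.size(); i++) {
            String phrase = phrases.get(i).trim();
            if (phrase.isEmpty()) {
                continue; // nothing to search for
            }
            int at = out.indexOf(phrase, from);
            if (at < 0) {
                continue; // phrase not found: leave the text untouched
            }
            String tag = "<page>" + (i + 1) + "</page>";
            out.insert(at, tag);
            from = at + tag.length() + phrase.length();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> phrases = Arrays.asList("Dieses Buch wendet", "b) mit ihm");
        System.out.println(tagPages("Dieses Buch wendet sich an Leser. b) mit ihm gehen.", phrases));
    }
}
```

Because the search only moves forward, a phrase that occurs earlier in the text than the previous match is skipped. In a .docm the text would still have to be read from, and written back to, the document's paragraphs or runs.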

EDITED (30.03.2017) My searchedTemp.txt looks like:

Dieses Buch wendet
Einführung in
b) mit ihm 
Straftäter werden 
etc.

EDIT (30.03.2017) The code in Apache POI looks like this:

    package dgi.writetags;

    import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
    import org.apache.poi.xwpf.usermodel.XWPFDocument;
    import org.apache.poi.xwpf.usermodel.XWPFParagraph;
    import org.apache.poi.xwpf.usermodel.XWPFRun;

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.Scanner;

    public class App {

        public static String SOURCE_FILE = "D:" + File.separator + "ReadPageTags" + File.separator + "Doc" + File.separator + "test.docm";
        public static String OUTPUT_FILE = "D:" + File.separator + "ReadPageTags" + File.separator + "Doc" + File.separator + "output.docm";
        public static String OUTPUT_SEARCH = "D:" + File.separator + "ReadPageTags" + File.separator + "GeneratedList" + File.separator + "searchTemp.txt";
        public static String actual;

        //MAIN METHOD
        public static void main(String[] args) throws Exception {
            //Application instances
            App docm = new App();
            System.out.println("[DOCM] App started.");

            //Search inside the TXT
            //docm.searchlist();

            //CheckStart
            //System.out.print(docm.searchlist());
            //CheckEnd

            //Replace text in DOCM
            docm.replaceText();

            //Exit after succeed
            //System.exit(0);
        }

        //replace text method
        private void replaceText() throws InvalidFormatException, IOException {
            App wt = new App();
            String[] lines = wt.searchlist().split("\n");
            XWPFDocument doc = new XWPFDocument(new FileInputStream(SOURCE_FILE));
            for (XWPFParagraph p : doc.getParagraphs()) {
                for (XWPFRun r : p.getRuns()) {
                    String text = r.toString();
                    for (String s : lines) {
                        text = text.replace(s, "pagetag " + s);
                        r.setText(text, 0);
                    }
                }
                System.out.println("[DOCM] Working...");
            }
            doc.write(new FileOutputStream(OUTPUT_FILE));
            doc.close();
            System.out.println("[DOCM] Finished");
        }

        //search into text file method
        //for the latest method return STRING
        private String searchlist() throws IOException, InvalidFormatException {
            BufferedReader br = new BufferedReader(new FileReader(OUTPUT_SEARCH));
            StringBuffer buf = new StringBuffer();
            String line;
            while (true) {
                line = br.readLine();
                buf.append(line);
                buf.append(System.lineSeparator());
                if (line == null) break;
            }
            br.close();
            return buf.toString();
        }
    }

Even if I remove `…`, add a while loop, or do the same thing some other way, it still does not replace anything...
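Two details of `searchlist()` could keep the search strings from ever matching, although the source does not confirm this as the cause. The loop appends `line` before checking it for `null`, so the text "null" ends up in the list. The lines are also joined with `System.lineSeparator()` and later split on "\n", which on Windows leaves a trailing "\r" on every entry. The list above also appears to contain trailing spaces. A minimal sketch of a reader that avoids all three follows; `PhraseListSketch` and `readPhrases` are hypothetical names, and a `StringReader` stands in for the real file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for searchlist(); the names are made up for illustration.
final class PhraseListSketch {

    // Returns the non-blank lines of the input, trimmed of surrounding whitespace.
    static List<String> readPhrases(Reader source) throws IOException {
        List<String> phrases = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            // Check for null before using the line, so "null" is never stored.
            while ((line = br.readLine()) != null) {
                String phrase = line.trim(); // drops trailing spaces and any stray \r
                if (!phrase.isEmpty()) {
                    phrases.add(phrase);
                }
            }
        }
        return phrases;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for new FileReader(OUTPUT_SEARCH).
        Reader demo = new StringReader("Dieses Buch wendet\r\nb) mit ihm \r\n\r\n");
        System.out.println(readPhrases(demo));
    }
}
```

Even with clean search strings, a phrase can still be missed when Word has stored it across several runs, since the posted loop looks at one `XWPFRun` at a time.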

P.S.: Oh, and if I take only the third line, it throws an error on the ")" part.
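An error on ")" is what Java's regex engine reports when "b) mit ihm" is compiled as a pattern, for example through `String.replaceAll`. The posted code uses `String.replace`, which treats its arguments literally, so this assumes an earlier regex-based variant of the code. A minimal sketch of a literal replacement through the regex API follows; `LiteralReplaceSketch` and `tagLiteral` are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are made up for illustration.
final class LiteralReplaceSketch {

    // Prefixes every literal occurrence of phrase with tag.
    static String tagLiteral(String text, String phrase, String tag) {
        String literalPattern = Pattern.quote(phrase);               // ")" loses its regex meaning
        String replacement = Matcher.quoteReplacement(tag + phrase); // "$" and "\" stay literal
        return text.replaceAll(literalPattern, replacement);
    }

    public static void main(String[] args) {
        System.out.println(tagLiteral("a) ohne ihn b) mit ihm c)", "b) mit ihm", "<page>3</page>"));
    }
}
```

When no regex features are needed, `String.replace` does the same job without any quoting.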

0 answers:

No answers