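I am using the Apache POI library to search and replace text in a .docm file.

I have a .txt file containing the first 3 words extracted from each page of a PDF file (which has the same content as the .docm file). I want to take each line from the .txt and search for it inside the .docm. When it matches, I will add an HTML-style page tag such as <page>(pdfpagenumber)</page> directly before those 3 words.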
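To make the goal concrete, here is a minimal sketch of the intended transformation on a plain string, leaving Apache POI out entirely. It assumes that line N of the .txt belongs to PDF page N and that the tag is written as <page>N</page>; the class name PageTagSketch, the method tagPages, and the sample sentence are made up for illustration.

```java
import java.util.List;

public class PageTagSketch {

    // Inserts "<page>N</page>" before the first occurrence of each phrase.
    // Assumption: phrases.get(i) is the start of page i + 1, and the phrases
    // appear in the text in page order.
    public static String tagPages(String text, List<String> phrases) {
        StringBuilder out = new StringBuilder(text);
        int from = 0; // search only after the previously tagged phrase
        for (int i = 0; i < phrases.size(); i++) {
            String phrase = phrases.get(i);
            if (phrase.isEmpty()) {
                continue; // an empty phrase would match everywhere
            }
            int at = out.indexOf(phrase, from);
            if (at < 0) {
                continue; // phrase not found: leave the text untouched
            }
            String tag = "<page>" + (i + 1) + "</page>";
            out.insert(at, tag);
            from = at + tag.length() + phrase.length();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Invented sample text, only for demonstration.
        String text = "Dieses Buch wendet sich an Leser. Einführung in das Thema.";
        System.out.println(tagPages(text, List.of("Dieses Buch wendet", "Einführung in")));
    }
}
```

The search uses indexOf, so characters such as ")" in a phrase are treated literally.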
EDITED (30.03.2017) My searchedTemp.txt looks like:

    Dieses Buch wendet
    Einführung in
    b) mit ihm
    Straftäter werden
    etc.

EDIT (30.03.2017) The code in Apache POI looks like this:
    package dgi.writetags;

    import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
    import org.apache.poi.xwpf.usermodel.XWPFDocument;
    import org.apache.poi.xwpf.usermodel.XWPFParagraph;
    import org.apache.poi.xwpf.usermodel.XWPFRun;
    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.Scanner;

    public class App {
        public static String SOURCE_FILE = "D:" + File.separator + "ReadPageTags" + File.separator + "Doc" + File.separator + "test.docm";
        public static String OUTPUT_FILE = "D:" + File.separator + "ReadPageTags" + File.separator + "Doc" + File.separator + "output.docm";
        public static String OUTPUT_SEARCH = "D:" + File.separator + "ReadPageTags" + File.separator + "GeneratedList" + File.separator + "searchTemp.txt";
        public static String actual;

        //MAIN METHOD
        public static void main(String[] args) throws Exception {
            //Application instances
            App docm = new App();
            System.out.println("[DOCM] App started.");
            //Search inside the TXT
            //docm.searchlist();
            //CheckStart
            //System.out.print(docm.searchlist());
            //CheckEnd
            //Replace text in DOCM
            docm.replaceText();
            //Exit after succeed
            //System.exit(0);
        }

        //replace text method
        private void replaceText() throws InvalidFormatException, IOException{
            App wt = new App();
            String[] lines = wt.searchlist().split("\n");
            XWPFDocument doc = new XWPFDocument(new FileInputStream(SOURCE_FILE));
            for (XWPFParagraph p : doc.getParagraphs()){
                for (XWPFRun r : p.getRuns()){
                    String text = r.toString();
                    for(String s : lines){
                        text = text.replace(s, "pagetag "+s);
                        r.setText(text,0);
                    }
                }
                System.out.println("[DOCM] Working...");
            }
            doc.write(new FileOutputStream(OUTPUT_FILE));
            doc.close();
            System.out.println("[DOCM] Finished");
        }

        //search into text file method
        //for the latest method return STRING
        private String searchlist() throws IOException, InvalidFormatException{
            BufferedReader br = new BufferedReader(new FileReader(OUTPUT_SEARCH));
            StringBuffer buf = new StringBuffer();
            String line;
            while (true) {
                line = br.readLine();
                buf.append(line);
                buf.append(System.lineSeparator());
                if (line == null) break;
            }
            br.close();
            return buf.toString();
        }
    }

Even if I remove part of this, add a while loop, or do the same thing some other way, it still does not replace anything...
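One thing that may contribute to nothing being replaced is how the phrase list is built. In searchlist() the line is appended before it is checked for null, so the returned string ends with the literal text "null". The lines are also joined with System.lineSeparator() but split on "\n", which on Windows leaves a trailing "\r" on every phrase, so a phrase would not equal the text inside a run. The following is a minimal sketch of reading the phrases without those side effects, not a confirmed fix. The class name PhraseListSketch and the method readPhrases are hypothetical, and UTF-8 encoding of the .txt is an assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class PhraseListSketch {

    // Reads one phrase per line. Files.readAllLines strips "\n", "\r\n" and "\r"
    // terminators, so no stray "\r" or "null" ends up in the list.
    // Assumption: the file is UTF-8 encoded.
    public static List<String> readPhrases(Path file) throws IOException {
        List<String> phrases = new ArrayList<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            String phrase = line.trim();
            if (!phrase.isEmpty()) {
                phrases.add(phrase); // skip blank lines: "" would match everywhere
            }
        }
        return phrases;
    }

    public static void main(String[] args) throws IOException {
        // Temporary file with Windows line endings, only for demonstration.
        Path tmp = Files.createTempFile("searchTemp", ".txt");
        Files.write(tmp, "Dieses Buch wendet\r\nb) mit ihm\r\n\r\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readPhrases(tmp));
        Files.delete(tmp);
    }
}
```

Separately, Word documents often split visible text across several runs, so a three-word phrase may not sit inside a single XWPFRun. A per-run replace would miss such a phrase even with a clean list.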
P.S.: Oh, and if I use only the third line ("b) mit ihm"), it throws an error at the ")" part.
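An error at ")" is typical of a phrase being interpreted as a regular expression. String.replace, as used in the code above, treats its arguments literally, so the error presumably came from a variant that used a regex-based method such as replaceAll; that is an assumption, since the failing variant is not shown. A small sketch of both safe options follows, with the hypothetical class LiteralReplaceSketch, an invented sample sentence, and an illustrative tag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class LiteralReplaceSketch {

    // Literal replacement: ")" has no special meaning for String.replace.
    public static String tagLiteral(String text, String phrase, String tag) {
        return text.replace(phrase, tag + phrase);
    }

    // Regex-based replacement made safe by quoting both pattern and replacement.
    public static String tagWithRegexApi(String text, String phrase, String tag) {
        return text.replaceAll(Pattern.quote(phrase), Matcher.quoteReplacement(tag + phrase));
    }

    public static void main(String[] args) {
        String text = "a) ohne ihn b) mit ihm";   // invented sample text
        String tag = "<page>3</page>";            // illustrative tag
        System.out.println(tagLiteral(text, "b) mit ihm", tag));
        System.out.println(tagWithRegexApi(text, "b) mit ihm", tag));
        try {
            // Unquoted phrase used as a regex: the unmatched ")" is a syntax error.
            text.replaceAll("b) mit ihm", tag);
        } catch (PatternSyntaxException e) {
            System.out.println("regex error: " + e.getDescription());
        }
    }
}
```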