I want to convert a docx to XHTML and then back to docx.
I found the docx4j-ImportXHTML project, which seems like a good fit for my needs.
Here is what I have so far:
public static void convertFragment(Long idConsultant, String fileName) {
    try {
        WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new File(fileName));

        // XHTML export
        AbstractHtmlExporter.HtmlSettings htmlSettings = new AbstractHtmlExporter.HtmlSettings();
        htmlSettings.setWmlPackage(wordMLPackage);
        boolean nestLists = false;
        if (nestLists) {
            SdtWriter.registerTagHandler("HTML_ELEMENT", new SdtToListSdtTagHandler());
        } else {
            htmlSettings.getFeatures().remove(ConversionFeatures.PP_HTML_COLLECT_LISTS);
        } // must do one or the other

        String htmlFilePath = fileName + "_" + idConsultant + ".html";
        // Close the stream before reading the file back, so the HTML is fully written
        try (OutputStream os = new FileOutputStream(htmlFilePath)) {
            Docx4J.toHTML(htmlSettings, os, Docx4J.FLAG_NONE);
        }

        // Read the HTML back and patch it into well-formed XHTML for the importer
        String stringFromFile;
        try (InputStream fis = new FileInputStream(htmlFilePath)) {
            stringFromFile = IOUtils.toString(fis, StandardCharsets.UTF_8) // assumes the export is UTF-8
                    .replace("&lt;", "<")
                    .replace("&gt;", ">")
                    .replace("<br>", "<br></br>")
                    .replace("<p></p>", "<br></br>")
                    .replace("&nbsp;", "\u00a0");
        }

        // XHTML import into a brand-new package
        wordMLPackage = WordprocessingMLPackage.createPackage();
        XHTMLImporterImpl xhtmlImporter = new XHTMLImporterImpl(wordMLPackage);
        wordMLPackage.getMainDocumentPart().getContent().addAll(xhtmlImporter.convert(stringFromFile, null));

        System.out.println(
                XmlUtils.marshaltoString(wordMLPackage.getMainDocumentPart().getJaxbElement(), true, true));
        wordMLPackage.save(new File(fileName)); // overwrites the original document
    } catch (Docx4JException | IOException e) {
        e.printStackTrace();
    }
}
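As a side note, the string clean-up between the export and the import can be pulled out and checked on its own. The sketch below uses only the JDK and assumes the intent is to turn the `&nbsp;` entity into a literal non-breaking space and to close bare `<br>` tags so a strict XML parser accepts the markup. `XhtmlCleanup` and `toWellFormed` are hypothetical names, not part of docx4j.

```java
public class XhtmlCleanup {

    /**
     * Hypothetical helper: patches exported HTML so that a strict XML parser
     * can read it. It only covers the cases handled in the method above.
     */
    static String toWellFormed(String html) {
        return html
                // &nbsp; is not a predefined XML entity, so use the character itself
                .replace("&nbsp;", "\u00a0")
                // close void <br> elements (also matches "<br >")
                .replaceAll("<br\\s*>", "<br/>")
                // turn empty paragraphs into line breaks
                .replace("<p></p>", "<br/>");
    }

    public static void main(String[] args) {
        System.out.println(toWellFormed("one&nbsp;two<br><p></p>"));
    }
}
```

`String.replace` treats its first argument as literal text, whereas `replaceAll` treats it as a regular expression, so the literal form is the safer default when no pattern is needed.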
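My problem is that the margins of the resulting document differ from those of the original, and I am overwriting the original document. The margins go from
to
Is there a way to keep the margins the same?
Thanks.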
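One possible direction, sketched under assumptions rather than confirmed: in WordprocessingML the page margins are stored as attributes of `w:pgMar` inside the section properties (`w:sectPr`), and a package built with `createPackage()` starts with its own section properties instead of the original's, which would likely explain the change. Carrying the original `w:pgMar` over to the new document should then keep the margins. The sketch below shows the idea directly on the XML with the JDK only. `MarginCopy` and its methods are hypothetical names, and it assumes a single-section document where the last `w:pgMar` belongs to the body-level `w:sectPr`.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class MarginCopy {

    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Parses an XML string into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Returns the last w:pgMar in document order, or null if there is none. */
    static Element lastPgMar(Document doc) {
        NodeList list = doc.getElementsByTagNameNS(W_NS, "pgMar");
        return list.getLength() == 0 ? null : (Element) list.item(list.getLength() - 1);
    }

    /** Copies every attribute of the source w:pgMar onto the target w:pgMar. */
    static void copyMargins(Document source, Document target) {
        Element from = lastPgMar(source);
        Element to = lastPgMar(target);
        if (from == null || to == null) {
            return; // nothing to copy, or nowhere to copy it to
        }
        NamedNodeMap attrs = from.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            to.setAttributeNS(attr.getNamespaceURI(), attr.getNodeName(), attr.getNodeValue());
        }
    }
}
```

With docx4j itself, the equivalent would presumably be to read the section properties from the original package before it is replaced, via `getMainDocumentPart().getJaxbElement().getBody().getSectPr()`, and apply them to the new package with `getBody().setSectPr(...)` before saving. If the original section properties reference headers or footers, those relationships would not exist in the new package, so copying only the margin and page-size settings may be the safer variant.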