Question

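I want to convert a docx to XHTML and then convert it back to docx.

I have seen the docx4j-ImportXHTML project, and it seems to fit my needs well.

Here is what I have so far: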

public static void convertFragment(Long idConsultant, String fileName){
        WordprocessingMLPackage wordMLPackage = null;
        try {
            wordMLPackage = WordprocessingMLPackage.load(new File(fileName));

            // XHTML export
            AbstractHtmlExporter exporter = new HtmlExporterNG2();
            AbstractHtmlExporter.HtmlSettings htmlSettings = new AbstractHtmlExporter.HtmlSettings();

            htmlSettings.setWmlPackage(wordMLPackage);

            boolean nestLists = false;
            if (nestLists) {
                SdtWriter.registerTagHandler("HTML_ELEMENT", new SdtToListSdtTagHandler());
            } else {
                htmlSettings.getFeatures().remove(ConversionFeatures.PP_HTML_COLLECT_LISTS);
            } // must do one or the other

            String htmlFilePath = fileName + "_" + idConsultant + ".html";

            // Close the output stream before the file is read back
            try (OutputStream os = new FileOutputStream(htmlFilePath)) {
                Docx4J.toHTML(htmlSettings, os, Docx4J.FLAG_NONE);
            }

            String stringFromFile;
            try (FileInputStream fis = new FileInputStream(htmlFilePath)) {
                stringFromFile = IOUtils.toString(fis)
                        .replaceAll("&lt;", "<")
                        .replaceAll("&gt;", ">")
                        .replaceAll("<br>", "<br></br>")
                        .replaceAll("<p></p>", "<br></br>")
                        .replaceAll("&nbsp;", "\u00a0");
            }

            // XHTML import into a newly created package
            wordMLPackage = WordprocessingMLPackage.createPackage();
            XHTMLImporterImpl XHTMLImporter = new XHTMLImporterImpl(wordMLPackage);
            wordMLPackage.getMainDocumentPart().getContent().addAll(XHTMLImporter.convert(stringFromFile, null));

            System.out.println(
            XmlUtils.marshaltoString(wordMLPackage.getMainDocumentPart().getJaxbElement(), true, true));


            wordMLPackage.save(new File(fileName) );
        } catch (Docx4JException e) {
            e.printStackTrace();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }catch (IOException e) {
            e.printStackTrace();
        }
}
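The string clean-up step in the snippet above can be looked at in isolation. The following is a minimal sketch, not part of docx4j: the class name `XhtmlCleanup` and the method name `normalize` are hypothetical and introduced only for illustration. It applies the same five substitutions in the same order, but with `String.replace`, which treats its arguments as literal text rather than as regular expressions. Note that un-escaping `&lt;` and `&gt;` can turn well-formed XHTML into ill-formed markup if the document text itself contains angle brackets.

```java
// Hypothetical helper; the class and method names are illustrative only.
final class XhtmlCleanup {

    private XhtmlCleanup() {}

    /** Applies the same five substitutions as the snippet above, in the same order. */
    static String normalize(String html) {
        return html
                .replace("&lt;", "<")             // un-escape opening angle brackets
                .replace("&gt;", ">")             // un-escape closing angle brackets
                .replace("<br>", "<br></br>")     // give bare <br> a closing tag
                .replace("<p></p>", "<br></br>")  // turn empty paragraphs into line breaks
                .replace("&nbsp;", "\u00a0");     // replace the HTML entity with the character
    }

    public static void main(String[] args) {
        System.out.println(normalize("line one&lt;br&gt;line two"));
    }
}
```

Since none of the five patterns contains a regex metacharacter, the result should match the `replaceAll` chain in the original code.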

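My problem is that the margins of the resulting document differ from those of the original, which I am overwriting. The margins go from

to

Is there a way to keep the margins the same?
Thanks.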

docx4j-ImportXHTML: page format problem after conversion

0 answers: