Page breaks in HTML-to-docx conversion

Time: 2016-02-27 12:35:22

Tags: html docx4j

I have a simple HTML template that I convert to docx with docx4j:

<html>

<head>
    <style type="text/css">
    tr,
    h2,
    tnr {
        font-family: Times New Roman;
        font-size: 11pt;
    }

    h2 {
        text-align: center;
    }

    .notesTable {
        border: 4px double black;
        border-collapse: collapse;
        border: 1px solid black;
    }
    </style>
</head>

<body>
    <table align="center" style="width: 75%; margin-left: -25%">
        <tbody>
            <tr style="height: 25px;font-family: 'Times New Roman';font-size: 16pt;">
                <td>28.02.2016 sunday</td>
                <td style="text-align: center; width: 30%;">test</td>
            </tr>
        </tbody>
    </table>
    <div>
        <ol>
            <li>ex1 </li>
            <li>ex2</li>
        </ol>
    </div>
    <p style="text-align: left;">
        <span style="font-family:'Comic Sans MS';">
    test
    </span>
    </p>
    <p>
        <h2>comments</h2> test
    </p>
    <p>
        <h2>contacts</h2> test
    </p>
    <br style="page-break-after: always; clear:both;" />
    <p>
    </p>
</body>

</html>

The problem is with this line:

<br style="page-break-after: always; clear:both;" />

Written this way, the resulting doc file has no page break. When I change it to

<br style="page-break-after: always; clear:both;">

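the page break appears, but I get an exception:

org.xml.sax.SAXParseException; lineNumber: 142; columnNumber: 3; The element type "br" must be terminated by the matching end-tag "</br>".

In addition, all styles fall back to the defaults. Please tell me what I am doing wrong.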

import org.docx4j.model.structure.PageSizePaper;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.WordprocessingML.AltChunkType;
import org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart;

import java.io.FileNotFoundException;
import java.io.FileOutputStream;

public class App {
    public static void main(String[] args) throws Docx4JException, FileNotFoundException {
        WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage(PageSizePaper.A4, false);

        MainDocumentPart mdp = wordMLPackage.getMainDocumentPart();
        String xhtml = "<html>" +
                "<head>" +
                "    <style type=\"text/css\">" +
                "    h2 {" +
                "        text-align: center;" +
                "        font-family: Times New Roman;" +
                "        font-size: 11 pt;" +
                "    }" +
                "    </style>" +
                "</head>" +
                "<body>" +
                "    <h2> Line on the first page</h2>" +
                "    <br style=\"page-break-after: always; clear:both;\" >" +
                "    <h2> Line on the second page</h2>" +
                "</body>" +
                "</html>";
        mdp.addAltChunk(AltChunkType.Xhtml, xhtml.getBytes());
        WordprocessingMLPackage pkgOut = mdp.convertAltChunks();
        FileOutputStream stream1 = new FileOutputStream("test.doc");
        pkgOut.save(stream1);

    }
}
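The exception quoted in the question is an XML well-formedness error: an XHTML alt chunk is parsed as XML, so an unclosed `<br>` is rejected before any conversion takes place. The sketch below reproduces this with the JDK's own parser only. It does not call docx4j, and the class name `WellFormedCheck` and the method `isWellFormed` are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {

    /** Returns true if the markup parses as well-formed XML (hypothetical helper). */
    public static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(markup.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // Includes SAXParseException, e.g. an element without a matching end tag.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String closed = "<html><body><br style=\"page-break-after: always\" /></body></html>";
        String unclosed = "<html><body><br style=\"page-break-after: always\"></body></html>";
        System.out.println(isWellFormed(closed));   // true
        System.out.println(isWellFormed(unclosed)); // false
    }
}
```

Running a check like this on the template before passing it to the converter separates markup errors from conversion problems.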

1 answer:

Answer 0: (score: 1)

I think you need to implement this yourself:

1. Download XHTMLImporterImpl.java from GitHub.
2. In the method "processInlineBoxContent" (inside the "br" condition block), add logic like the following:

Br br = Context.getWmlObjectFactory().createBr();
Attr attrNode = s.getElement().getAttributeNode("style");
if (attrNode != null && attrNode.getValue().contains("page-break-after: always")) {
    br.setType(STBrType.PAGE);
}
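The substring test in the answer only matches the exact spelling "page-break-after: always", so a style written as "page-break-after:always" would be missed. Below is a minimal sketch of a more tolerant check in plain Java. `PageBreakStyle` and `requestsPageBreakAfter` are hypothetical names, not part of docx4j, and the parsing is deliberately simple (it ignores cases such as `!important`).

```java
import java.util.Locale;

public class PageBreakStyle {

    /** Returns true if the inline style declares page-break-after: always (hypothetical helper). */
    public static boolean requestsPageBreakAfter(String style) {
        if (style == null) {
            return false;
        }
        for (String declaration : style.split(";")) {
            int colon = declaration.indexOf(':');
            if (colon < 0) {
                continue; // skip empty or malformed declarations
            }
            String property = declaration.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = declaration.substring(colon + 1).trim().toLowerCase(Locale.ROOT);
            if (property.equals("page-break-after") && value.equals("always")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(requestsPageBreakAfter("page-break-after: always; clear:both;")); // true
        System.out.println(requestsPageBreakAfter("page-break-after:always"));               // true
        System.out.println(requestsPageBreakAfter("clear: both"));                            // false
    }
}
```

In the patched importer, the condition could then read `attrNode != null && PageBreakStyle.requestsPageBreakAfter(attrNode.getValue())`, assuming the helper is on the classpath.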