How to convert a Word DOCX file to HTML using Java

Time: 2016-07-01 06:53:04

Tags: java html apache-poi converter docx

I am using the following code.
It only converts .doc documents to HTML, but I need to convert .docx documents to HTML.

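import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocumentCore;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.apache.poi.hwpf.converter.WordToHtmlUtils;

try
{
    // HWPF classes read the binary .doc format only, not .docx
    HWPFDocumentCore wordDocument = WordToHtmlUtils.loadDoc(new FileInputStream("C:\\DOC.doc"));

    // Build the HTML as a DOM tree
    WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(
            DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
    wordToHtmlConverter.processDocument(wordDocument);

    org.w3c.dom.Document htmlDocument = wordToHtmlConverter.getDocument();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    DOMSource domSource = new DOMSource(htmlDocument);
    StreamResult streamResult = new StreamResult(out);

    // Serialize the DOM tree to UTF-8 encoded HTML
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer serializer = tf.newTransformer();
    serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    serializer.setOutputProperty(OutputKeys.INDENT, "yes");
    serializer.setOutputProperty(OutputKeys.METHOD, "html");
    serializer.transform(domSource, streamResult);
    out.close();

    // Decode with the same charset the serializer wrote, not the platform default
    String result = new String(out.toByteArray(), StandardCharsets.UTF_8);
    System.out.println(result);

    // ConvertDocxBigToXHTML is my own class
    ConvertDocxBigToXHTML html = new ConvertDocxBigToXHTML();
    html.creatHTML(result);
}
catch(Exception e)
{
    e.printStackTrace();
}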

Can someone help me with what changes I need to make to the code above?
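
For context: HWPFDocumentCore and WordToHtmlConverter belong to Apache POI's HWPF module, which handles the binary .doc format, whereas a .docx file is a ZIP package whose main body normally lives in word/document.xml. What follows is only a minimal sketch using nothing but the JDK, not a full converter: it pulls the text of each w:p paragraph out of that part and wraps it in <p> tags, ignoring styles, tables, images and lists. The class name DocxTextToHtmlSketch and the helpers docxToHtml, paragraphsToHtml, escape and sampleDocx are hypothetical names introduced for illustration; they are not part of POI.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class: extracts paragraph text only, no formatting.
class DocxTextToHtmlSketch {
    // WordprocessingML main namespace used by word/document.xml.
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    static String docxToHtml(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"word/document.xml".equals(entry.getName())) {
                    continue;
                }
                // Buffer the entry so the XML parser cannot close the ZIP stream.
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
                dbf.setNamespaceAware(true);
                Document xml = dbf.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(buf.toByteArray()));
                return paragraphsToHtml(xml);
            }
        }
        throw new IOException("word/document.xml not found in package");
    }

    // Emits one <p> per w:p element, concatenating the text of its w:t runs.
    static String paragraphsToHtml(Document xml) {
        StringBuilder html = new StringBuilder("<html><body>\n");
        NodeList paragraphs = xml.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            NodeList runs = ((Element) paragraphs.item(i)).getElementsByTagNameNS(W_NS, "t");
            StringBuilder text = new StringBuilder();
            for (int j = 0; j < runs.getLength(); j++) {
                text.append(runs.item(j).getTextContent());
            }
            html.append("<p>").append(escape(text.toString())).append("</p>\n");
        }
        return html.append("</body></html>").toString();
    }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Demo only: builds a stripped-down package holding just word/document.xml.
    // A real .docx also contains [Content_Types].xml, relationships, styles, etc.
    static byte[] sampleDocx(String... paragraphs) throws IOException {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?>")
                .append("<w:document xmlns:w=\"").append(W_NS).append("\"><w:body>");
        for (String p : paragraphs) {
            xml.append("<w:p><w:r><w:t>").append(escape(p)).append("</w:t></w:r></w:p>");
        }
        xml.append("</w:body></w:document>");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write(xml.toString().getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] docx = sampleDocx("Hello from DOCX", "1 < 2");
        System.out.println(docxToHtml(new ByteArrayInputStream(docx)));
    }
}
```

To try it on a real file, pass a FileInputStream for the .docx to docxToHtml. If formatting has to survive, Apache POI's XWPF module (XWPFDocument) is the usual entry point for reading .docx; as far as I know POI does not ship an XWPF counterpart of WordToHtmlConverter, so a separate converter library or hand-written mapping would be needed on top of it.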

0 answers:

No answers