apache-poi WordToHtmlConverter does not convert images correctly

Asked: 2016-04-07 04:39:33

Tags: java apache-poi

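I am using Apache POI to convert a Word file to HTML. My document contains both text and images. The text converts to HTML fine, but the images are not converted. Is there a way to convert the images as well?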

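import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocumentCore;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.apache.poi.hwpf.converter.WordToHtmlUtils;

// Load the .doc file and convert it into an in-memory HTML DOM.
HWPFDocumentCore wordDocument = WordToHtmlUtils.loadDoc(new FileInputStream("my_document_path"));

WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
wordToHtmlConverter.processDocument(wordDocument);

// Serialize the DOM to UTF-8 encoded HTML bytes.
org.w3c.dom.Document htmlDocument = wordToHtmlConverter.getDocument();
ByteArrayOutputStream out = new ByteArrayOutputStream();
DOMSource domSource = new DOMSource(htmlDocument);
StreamResult streamResult = new StreamResult(out);

TransformerFactory tf = TransformerFactory.newInstance();
Transformer serializer = tf.newTransformer();
serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
serializer.setOutputProperty(OutputKeys.MEDIA_TYPE, "text/html"); // was "text/image", which is not a valid media type
serializer.setOutputProperty(OutputKeys.INDENT, "yes");
serializer.setOutputProperty(OutputKeys.METHOD, "html");
serializer.transform(domSource, streamResult);
out.close();

// Write the bytes as-is (overwriting any existing file) so the UTF-8 encoding
// is not lost by a round trip through the platform default charset.
Files.write(Paths.get("stored_html_path"), out.toByteArray());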
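
One thing that may explain the missing images: as far as I can tell, WordToHtmlConverter only emits an img element when a PicturesManager has been registered through setPicturesManager before processDocument is called, and the string returned by its savePicture callback is used as the image src. The sketch below is one possible approach, assuming that inlining pictures as data: URIs is acceptable. InlineImages and toDataUri are hypothetical names that are not part of POI, and the POI wiring is shown only as a comment because it needs the poi-scratchpad jar on the classpath.

```java
import java.util.Base64;

/**
 * Hypothetical helper (not part of Apache POI): turns raw picture bytes into
 * a data: URI that can be used directly as the src of an img element.
 */
final class InlineImages {
    private InlineImages() {}

    /** Builds "data:<mime>;base64,<payload>"; falls back to a generic MIME type. */
    static String toDataUri(byte[] content, String mimeType) {
        String type = (mimeType == null || mimeType.isEmpty())
                ? "application/octet-stream"
                : mimeType;
        return "data:" + type + ";base64," + Base64.getEncoder().encodeToString(content);
    }
}

// Assumed wiring with the converter from the question (not compiled here):
//
// wordToHtmlConverter.setPicturesManager(
//     (content, pictureType, suggestedName, widthInches, heightInches) ->
//         InlineImages.toDataUri(content, pictureType.getMime()));
// wordToHtmlConverter.processDocument(wordDocument);
```

If the HTML should stay small, the callback could instead save each picture to disk under suggestedName and return its relative path.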

0 answers:

No answers