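I am using the Apache POI library to convert a Word document to HTML, but the output is styled with CSS class names. I need HTML without classes for Android. Here is my code: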
HWPFDocumentCore wordDocument = WordToHtmlUtils.loadDoc(input);
WordToHtmlConverter wordToHtmlConverter = null;
try {
    // Give the converter an empty DOM document to fill with HTML
    wordToHtmlConverter = new WordToHtmlConverter(
            DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
    wordToHtmlConverter.processDocument(wordDocument);
    org.w3c.dom.Document htmlDocument = wordToHtmlConverter.getDocument();

    // Serialize the DOM to HTML text
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    DOMSource domSource = new DOMSource(htmlDocument);
    StreamResult streamResult = new StreamResult(out);
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer serializer = tf.newTransformer();
    serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    serializer.setOutputProperty(OutputKeys.INDENT, "yes");
    serializer.setOutputProperty(OutputKeys.METHOD, "html");
    serializer.transform(domSource, streamResult);
    out.close();

    // Decode with the same charset the serializer wrote, not the platform default
    String result = out.toString("UTF-8");
    System.err.println(result);
} catch (ParserConfigurationException e) {
    e.printStackTrace();
} catch (TransformerConfigurationException e) {
    e.printStackTrace();
} catch (TransformerException e) {
    e.printStackTrace();
}
Result:
<html><head>
<style type="text/css">.b1{white-space-collapsing:preserve;}
.b2{margin: 0.90555555in 0.66944444in 0.66944444in 0.66944444in;}
.s1{text-transform:uppercase;color:black;}
.s2{font-weight:bold;text-transform:uppercase;color:black;}
.s3{color:black;}
.s4{font-style:italic;color:black;}
.s5{font-weight:bold;color:black;}
.s6{font-weight:bold;font-style:italic;text-transform:uppercase;color:black;}
.s7{font-weight:bold;color:maroon;}....</style>
</head>
<body>
<p class="p2"></p>
<p class="p2">
<span class="s2">...</span>
</p>
<p class="p2">
<br>
...</body>
</html>
Please suggest a way to do the conversion without class names. I need standard tags for Android rather than classes.
Answer 0 (score: 3)
You can use the following link: http://www.textfixer.com/html/convert-word-to-html.php We found other applications, but they are shareware.
Answer 1 (score: 2)
Parse the generated HTML and modify it to fit your needs.
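
One way to do that is to post-process the DOM before it is serialized. The sketch below is only an illustration, not part of POI: `ClassInliner`, `parseRules` and `inlineClasses` are hypothetical names. It assumes the `<style>` element is already filled in on the document returned by `getDocument()`, and that the stylesheet contains only simple `.name{...}` rules like the ones in the question. It copies each rule's declarations into a `style` attribute, removes the `class` attributes, and drops the `<style>` element.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper: replaces class-based styling with inline style attributes.
final class ClassInliner {
    // Matches simple rules of the form ".name{declarations}"; other selectors are ignored.
    private static final Pattern RULE =
            Pattern.compile("\\.([A-Za-z0-9_-]+)\\s*\\{([^}]*)\\}");

    // Parses a stylesheet into a map from class name to its declaration text.
    static Map<String, String> parseRules(String css) {
        Map<String, String> rules = new HashMap<>();
        Matcher m = RULE.matcher(css);
        while (m.find()) {
            rules.put(m.group(1), m.group(2).trim());
        }
        return rules;
    }

    // Appends declarations to the buffer, making sure they end with a semicolon.
    private static void append(StringBuilder sb, String declarations) {
        if (declarations == null || declarations.isEmpty()) {
            return;
        }
        sb.append(declarations);
        if (!declarations.endsWith(";")) {
            sb.append(';');
        }
    }

    // Inlines class rules into style attributes and removes <style> and class.
    static void inlineClasses(Document doc) {
        Map<String, String> rules = new HashMap<>();
        NodeList styles = doc.getElementsByTagName("style");
        // The NodeList is live, so walk it backwards while removing nodes.
        for (int i = styles.getLength() - 1; i >= 0; i--) {
            Node style = styles.item(i);
            rules.putAll(parseRules(style.getTextContent()));
            style.getParentNode().removeChild(style);
        }

        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element el = (Element) all.item(i);
            if (!el.hasAttribute("class")) {
                continue;
            }
            StringBuilder inline = new StringBuilder();
            append(inline, el.getAttribute("style")); // keep any existing inline style
            for (String name : el.getAttribute("class").trim().split("\\s+")) {
                append(inline, rules.get(name));
            }
            el.removeAttribute("class");
            if (inline.length() > 0) {
                el.setAttribute("style", inline.toString());
            }
        }
    }
}
```

In the question's code this would be called as `ClassInliner.inlineClasses(htmlDocument);` right before the `DOMSource` is created. If the Android view you render into ignores inline styles, the same loop could instead wrap content in tags such as `<b>` or `<i>` based on the declarations it finds; that mapping is left out here.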