Get an HTML string from an XML document

Date: 2012-11-19 09:57:50

Tags: java xml xml-parsing

I have the following XML:

<version>
    <name>2.0.2</name>
    <description>
-Stop hsql database after close fist <br />
-Check for null category name before adding it to the categories list  <br />
-Fix NPE bug if there is no updates  <br />
-add default value for variable, change read bytes filter, and description of propertyFile  <br />
-Change HTTP web Proxy (the “qcProxy” field ) to http://web-proxy.isr.hp.com:8080  <br />
</description>
    <fromversion>>=2.0</fromversion>
</version>

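How can I use Java to return the content of the description tag as a string?

1 answer:

Answer 0 (score: 2)

This is fairly standard Java XML parsing that you can find documented in many places online. Using XPath from the standard JDK, it looks like this: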

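import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

String xml = "your XML";

// load the XML String into a DOM Document object
// (encode explicitly: the no-arg getBytes() uses the platform default charset)
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
ByteArrayInputStream bis = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
Document doc = docBuilder.parse(bis);

// XPath to retrieve the content of the <version>/<description> tag
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("/version/description");
Node description = (Node) expr.evaluate(doc, XPathConstants.NODE);
System.out.println("description: " + description.getTextContent());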
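
To see what getTextContent() returns when an element contains child elements such as <br />, here is a minimal, self-contained sketch. The class name TextContentDemo, the helper rootText, and the shortened sample XML are made up for illustration; they are not part of the question or the answer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class TextContentDemo {
    // Hypothetical helper: parses the XML and returns getTextContent() of the root element.
    static String rootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Shortened stand-in for the <description> element in the question
        String xml = "<description>-first line <br />-second line <br /></description>";
        // Only the text nodes are concatenated; the <br /> elements are dropped.
        System.out.println(rootText(xml)); // prints "-first line -second line "
    }
}
```

The text survives but the <br /> markup does not, so getTextContent() is only enough when the plain text is all you need.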

Edit

Because your text content contains XML <br/> elements, it cannot be retrieved with Node.getTextContent(), which returns only the text and drops the markup. One solution is to transform the <description> Node into its XML String equivalent and then strip the <description> root tags.

Here is a complete example:

String xml = "<version>\r\n" +
        "    <name>2.0.2</name>\r\n" +
        "    <description>\r\n" +
        "-Stop hsql database after close fist <br />\r\n" +
        "-Check for null category name before adding it to the categories list  <br />\r\n" +
        "-Fix NPE bug if there is no updates  <br />\r\n" +
        "-add default value for variable, change read bytes filter, and description of propertyFile  <br />\r\n" +
        "-Change HTTP web Proxy (the “qcProxy” field ) to http://web-proxy.isr.hp.com:8080  <br />\r\n" +
        "</description>\r\n" +
        "    <fromversion>>=2.0</fromversion>\r\n" +
        "</version>";

// load the XML String into a DOM Document object
// (StandardCharsets is java.nio.charset.StandardCharsets; the no-arg getBytes() uses the platform default charset)
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
ByteArrayInputStream bis = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
Document doc = docBuilder.parse(bis);

// XPath to retrieve the <version>/<description> tag
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("/version/description");
Node descriptionNode = (Node) expr.evaluate(doc, XPathConstants.NODE);

// Transformer to convert the XML Node to its String equivalent
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
StringWriter sw = new StringWriter();
transformer.transform(new DOMSource(descriptionNode), new StreamResult(sw));

// strip the opening and closing <description> tags
String description = sw.getBuffer().toString().replaceAll("</?description>", "");
System.out.println(description);

Prints:

-Stop hsql database after close fist <br/>
-Check for null category name before adding it to the categories list  <br/>
-Fix NPE bug if there is no updates  <br/>
-add default value for variable, change read bytes filter, and description of propertyFile  <br/>
-Change HTTP web Proxy (the “qcProxy” field ) to http://web-proxy.isr.hp.com:8080  <br/>

Edit 2

To get all of them, you need to retrieve a NODESET of the matching nodes and iterate over it, performing the same operation as above.

// XPath to retrieve every <description> element in the document
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("//description");
NodeList descriptionNode = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);

List<String> descriptions = new ArrayList<String>(); // hold all the descriptions as String
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
for (int i = 0; i < descriptionNode.getLength(); ++i) {
    Node descr = descriptionNode.item(i);
    StringWriter sw = new StringWriter();
    transformer.transform(new DOMSource(descr), new StreamResult(sw));
    String description = sw.getBuffer().toString().replaceAll("</?description>", "");
    descriptions.add(description);
}
// here you can do what you want with the List of Strings `descriptions`
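
The replaceAll("</?description>", "") step used above only matches the bare tag, so it would leave the opening tag in place if <description> ever carried attributes. One possible alternative, which is not part of the original answer, is to serialize each child of the node instead of the node itself, so there is no wrapper tag to strip. The sketch below assumes the JDK's default Transformer implementation; the names InnerXmlDemo, innerXml, and innerXmlOfRoot are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class InnerXmlDemo {
    // Hypothetical helper: serializes every child of `parent` (text and elements)
    // into one String, leaving out the parent's own start and end tags.
    static String innerXml(Node parent) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter sw = new StringWriter();
            NodeList children = parent.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                transformer.transform(new DOMSource(children.item(i)), new StreamResult(sw));
            }
            return sw.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize node", e);
        }
    }

    // Hypothetical convenience wrapper: parses `xml` and returns the inner XML of its root element.
    static String innerXmlOfRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return innerXml(doc.getDocumentElement());
        } catch (IllegalStateException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The id attribute is made up to show a case the regex would not handle
        String xml = "<description id=\"1\">-first line <br />-second line <br /></description>";
        System.out.println(innerXmlOfRoot(xml));
    }
}
```

In the answer's loop, this would mean calling innerXml(descr) in place of the transform-and-replaceAll pair. The trade-off is one transform call per child node rather than one per description.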