How to get the values of a repeated tag from XML held in a String (Java)

Asked: 2014-12-05 14:48:57

Tags: java xml string

I have XML in this format:

<Container1>
<Description>one</Description>
</Container1>
<Container2>
<Description>Two</Description>
</Container2>

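This XML is read in as a String. (I cannot parse the XML directly.) From that String I need to collect the values of all Description tags into a List. Any pointers on how to do this?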

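1 answer:

Answer 0: (score: 0)

Since what you receive is XML, you can parse it and use XPath to extract the information you need. Note that you must pass well-formed XML to builder#parse, which requires a single root element (I wrapped the XML in a root element).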

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class XPathExtractor {

    public static void main(String[] args) throws XPathExpressionException, IOException, SAXException, ParserConfigurationException {
        String s = "<root>\n" +
                "<Container1>\n" +
                "<Description>one</Description>\n" +
                "</Container1>\n" +
                "<Container2>\n" +
                "<Description>Two</Description>\n" +
                "</Container2>\n" +
                "</root>";
        // Build a namespace-aware DOM parser.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Parse the string; it must be well-formed, hence the <root> wrapper above.
        Document doc = builder.parse(new ByteArrayInputStream(s.getBytes(StandardCharsets.UTF_8)));
        // Select the text node of every Description element, at any depth.
        XPath xpath = XPathFactory.newInstance().newXPath();
        XPathExpression descriptionExpr = xpath.compile("//Description/text()");
        Object result = descriptionExpr.evaluate(doc, XPathConstants.NODESET);
        NodeList nodes = (NodeList) result;
        // Print each value in document order.
        for (int i = 0; i < nodes.getLength(); i++) {
            System.out.println(nodes.item(i).getNodeValue());
        }
    }
}

Output

one
Two
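
The question asks for the values in a List rather than printed to the console. A minimal sketch of that variant follows, assuming the incoming String is a fragment without an XML declaration (a leading `<?xml ...?>` would no longer be at the start of the document once wrapped, and parsing would fail). The class name `DescriptionCollector` and the method `collect` are hypothetical names chosen for illustration; the parser and XPath calls are the standard JAXP ones used in the answer.

```java
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the class and method names are illustrative only.
public class DescriptionCollector {

    // Wraps the fragment in a synthetic <root> element so it is well-formed,
    // then returns the text of every Description element in document order.
    // Assumption: the fragment carries no XML declaration of its own.
    public static List<String> collect(String fragment) throws Exception {
        String wrapped = "<root>" + fragment + "</root>";
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(wrapped)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//Description", doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String fragment = "<Container1><Description>one</Description></Container1>"
                + "<Container2><Description>Two</Description></Container2>";
        System.out.println(collect(fragment));
    }
}
```

Reading from a `StringReader` avoids the round trip through bytes and a charset, and selecting the `Description` elements (instead of their text nodes) with `getTextContent()` still yields one list entry for an element whose text is split across several nodes, for example by a CDATA section.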