Question

Contents of the XML file:

<distributionChannels><distributionChannel type="Wap" id="1"><contentChannelRefs>
<contentChannelRef id="2"><categories><category 
link="http://images/11.gif" id="1"><names><name lang="de">Top Downloads</name><name
lang="ww">Tops</name></names></category></categories></contentChannelRef>
</contentChannelRefs></distributionChannel>  
</distributionChannels>

How do I remove the unwanted content that I read from the XML file? The output should look like this:

<category link="http://images/11.gif" id="1"><names><name lang="de">Top Downloads</name><name lang="ww">Tops</name></names></category>

Answer 1

The reliable solution is to use an XML parser. The simple solution is:

s = s.substring(s.indexOf("<categories>"), s.indexOf("</categories>") + 13);
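Note that the one-liner above keeps the `<categories>` wrapper, while the requested output starts at `<category`. Here is a minimal sketch of the same substring idea narrowed to the element itself. The class and method names are hypothetical, and it assumes the input holds a single, non-nested `<category>` element.

```java
// Hypothetical helper, not part of the original answer.
// Assumes one non-nested <category> element in the input.
class CategoryExtractor {
    static String extractCategory(String xml) {
        String open = "<category";    // does not match "<categories" ('y' vs 'i')
        String close = "</category>"; // does not match "</categories>"
        int start = xml.indexOf(open);
        if (start < 0) {
            return null; // no opening tag found
        }
        int end = xml.indexOf(close, start);
        if (end < 0) {
            return null; // no closing tag found
        }
        return xml.substring(start, end + close.length());
    }

    public static void main(String[] args) {
        String xml = "<categories><category link=\"http://images/11.gif\" id=\"1\">"
                + "<names><name lang=\"de\">Top Downloads</name></names>"
                + "</category></categories>";
        System.out.println(extractCategory(xml));
    }
}
```

Using `close.length()` instead of a hard-coded offset avoids off-by-one mistakes if the tag name changes.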

If you want to read the categories one by one, use a regular expression:

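    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // DOTALL lets '.' match the line breaks inside the XML
    Matcher m = Pattern.compile("<category.*?>.*?</category>", Pattern.DOTALL).matcher(xml);
    while (m.find()) {
        System.out.println(m.group());
    }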

Answer 2

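Pattern matching on XML is not recommended. Use a parser to get the nodes and handle them as needed. In case you are interested in printing them, I have included code that prints the nodes.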

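import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public static void main(String[] args)
        throws ParserConfigurationException, SAXException,
        IOException, XPathExpressionException {
    DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
    domFactory.setNamespaceAware(true);
    DocumentBuilder builder = domFactory.newDocumentBuilder();
    // s is the XML document held in a String
    Document doc = builder.parse(new InputSource(new StringReader(s)));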

    XPathFactory factory = XPathFactory.newInstance();
    XPath xpath = factory.newXPath();
    XPathExpression expr
            = xpath.compile("//categories//category");

    Object result = expr.evaluate(doc, XPathConstants.NODESET);
    NodeList nodes = (NodeList) result;
    //This is where you are printing things. You can handle differently if
    //you would like.
    for (int i = 0; i < nodes.getLength(); i++) {
        System.out.println(nodeToString(nodes.item(i)));
    }
}

private static String nodeToString(Node node) {
    StringWriter sw = new StringWriter();
    try {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        t.setOutputProperty(OutputKeys.INDENT, "yes");
        t.transform(new DOMSource(node), new StreamResult(sw));
    } catch (TransformerException te) {
        te.printStackTrace();
    }
    return sw.toString();
}
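If you want to work with the values instead of printing the markup, the parsed nodes can also be read directly. Here is a minimal sketch, assuming the element and attribute names from the question (`category`, `name`, `lang`). The class and method names are hypothetical, and errors are simply rethrown as unchecked exceptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original answer.
class CategoryNames {
    // Collects "lang=text" for every <name> found under a <category>.
    static List<String> readNames(String xml) {
        List<String> out = new ArrayList<>();
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList categories = doc.getElementsByTagName("category");
            for (int i = 0; i < categories.getLength(); i++) {
                Element category = (Element) categories.item(i);
                NodeList names = category.getElementsByTagName("name");
                for (int j = 0; j < names.getLength(); j++) {
                    Element name = (Element) names.item(j);
                    out.add(name.getAttribute("lang") + "=" + name.getTextContent());
                }
            }
        } catch (Exception e) {
            // Parsing or configuration failure; surfaced as unchecked for brevity
            throw new IllegalStateException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<categories><category link=\"http://images/11.gif\" id=\"1\">"
                + "<names><name lang=\"de\">Top Downloads</name>"
                + "<name lang=\"ww\">Tops</name></names></category></categories>";
        System.out.println(readNames(xml)); // prints [de=Top Downloads, ww=Tops]
    }
}
```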

Remove unwanted strings before and after a string in an XML file

2 answers: