Question

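I am trying to parse XML from a URL using Jsoup.

Some nodes in this XML carry a namespace prefix.

For example: <wsdl:types>

I want to select every node whose tag name is "types", whatever its namespace.

I can select this particular node with the expression "wsdl|types".

But how can I select all nodes named "types" in any namespace?

I tried the expression "*|types", but it did not work.

Please help.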

Answer 1

There is no such selector yet. You can use a workaround, though: it is not as readable as a selector, but it does the job.

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;
import org.jsoup.select.Elements;

/*
 * Connect to the url and parse the document; an XML parser is used
 * instead of the default one (html). Note that get() can throw an
 * IOException, which the enclosing method has to handle or declare.
 */
final String url = "http://www.consultacpf.com/webservices/producao/cdc.asmx?wsdl";
Document doc = Jsoup.connect(url).parser(Parser.xmlParser()).get();


// Elements named 'types', with any namespace prefix, are stored here
Elements withTypes = new Elements();

// Select all elements
for( Element element : doc.select("*") )
{
    // Split the tag name into prefix and local name at ':'
    final String[] parts = element.tagName().split(":");

    /*
     * If there's a namespace (otherwise parts.length == 1) use the 2nd
     * part and check if it is 'types'
     */
    if( parts.length > 1 && parts[1].equals("types") )
    {
        // Add this element to the found elements list
        withTypes.add(element);
    }
}
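The tag-name check inside the loop above can be tried out without Jsoup or a network connection. The following is only a minimal sketch: the class name `NamespaceMatch`, the helpers `hasLocalName` and `filter`, and the sample tag names are made up for illustration and are not part of Jsoup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, used only to illustrate the check
class NamespaceMatch {

    // True when tagName has the form "<prefix>:<localName>".
    // Like the loop above, this rejects tags that have no prefix.
    static boolean hasLocalName(String tagName, String localName) {
        String[] parts = tagName.split(":");
        return parts.length > 1 && parts[1].equals(localName);
    }

    // Keeps only the matching tag names, preserving their order
    static List<String> filter(List<String> tagNames, String localName) {
        List<String> matches = new ArrayList<>();
        for (String tagName : tagNames) {
            if (hasLocalName(tagName, localName)) {
                matches.add(tagName);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Sample tag names, invented for this example
        List<String> tagNames =
                Arrays.asList("wsdl:types", "s:types", "types", "wsdl:message");
        System.out.println(filter(tagNames, "types")); // prints [wsdl:types, s:types]
    }
}
```

Note that a plain `types` element without a prefix is not matched. If it should be, one option is to compare the last part of the split name instead of requiring `parts.length > 1`.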

You can move the essential part of this code into a method, which gives you something like this:

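Elements nsSelect(Document doc, String value)
{
    Elements found = new Elements();
    for( Element element : doc.select("*") )
    {
        // Keep elements whose tag name is "<prefix>:<value>"
        final String[] parts = element.tagName().split(":");
        if( parts.length > 1 && parts[1].equals(value) )
            found.add(element);
    }
    return found;
}

...

Elements withTypes = nsSelect(doc, "types");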
