XML to Java parser: how to parse attributes that appear inside a CDATA section

Time: 2013-12-05 09:27:10

Tags: java xml xml-parsing cdata domparser

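I am currently extracting data from an HP Quality Center SQL database, and some of the data needed to configure the correct representation of the other data is stored as XML. I have a basic understanding of XML and have been able to parse most of the attributes into runtime objects that hold the fields needed for further data retrieval. However, there is one area whose attributes I cannot extract. That inner data is essential for processing at runtime, because it holds the important information about which tables to search and which filters to apply.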

I have a runnable example class that simply prints each field I read into the Java object, and it fails when I try to extract the CDATA attributes.

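I have read many articles about what CDATA is, but none of them seem to cover a setup like this one, where the inside of the CDATA section apparently contains attributes.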

So, is it possible to extract these attributes in a similar way to the other attributes? If so, how?

Thanks in advance.

Code (the XML string is a hard-coded sample from the database):

import java.io.ByteArrayInputStream;
import java.io.IOException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;


public class XMLParser {

    public static void main(String[] args){
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
                "<AnalysisDefinition Version=\"2.0\" " +
                    "GraphProviderId=\"QC.Graph.Provider\" " +
                    "GroupByField=\"TC_STATUS\" " +
                    "ForceRefresh=\"False\" " +
                    "SelectedProjects=\"CURRENT-PROJECT-UID\" " +
                    "SumOfField=\"\" TimeResolution=\"Day\" " +
                    "DisplayOptions=\"Regular\">" +

                    "<Filter " +
                        "FilterState=\"Custom\" " +
                        "FilterFormat=\"Frec\">" +

                        "<![CDATA[[Filter]{" +
                            "TableName:TESTCYCL," +
                            "ColumnName:TC_ASSIGN_RCYC," +
                            "LogicalFilter:\\00000047\\^URLAnonymized^," +
                            "VisualFilter:\\00000047\\^URLAnonymized^," +
                            "NO_CASE:" +
                            "}" +
                            "]]>" +
                        "</Filter>" +

                        "<DateRange " +
                            "PeriodType=\"Custom\" " +
                            "StartDate=\"2013,9,29\" " +
                            "EndDate=\"2013,10,14\" " +
                        "/>" +
                    "</AnalysisDefinition>";

        AnalysisDefinition ad = createFilterData(xml);      

        System.out.println("DisplayOptions: " + ad.getDisplayOptions());
        System.out.println("graphProviderID: " + ad.getGraphProviderId());
        System.out.println("GroupByField: " + ad.getGroupByField());
        System.out.println("SumOfField: " + ad.getSumOfField());
        System.out.println("TimeResolution: " + ad.getTimeResolution());
        System.out.println("Version: " + ad.getVersion());

        System.out.println("Filter: " + ad.getFilter());
        System.out.println("DateRange: " + ad.getDateRange());

        System.out.println("FilterState: " + ad.getFilter().getFilterState());
        System.out.println("FilterFormat: " + ad.getFilter().getFilterFormat());
        System.out.println("TableName: " + ad.getFilter().getTableName());


    }

    public static AnalysisDefinition createFilterData(String xml){

        AnalysisDefinition ad = new AnalysisDefinition();

        DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
        docFactory.setNamespaceAware(true);
        docFactory.setValidating(false);
        docFactory.setIgnoringElementContentWhitespace(true);
        Document doc = null;
        try {
            DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
            // Encode explicitly as UTF-8 to match the encoding named in the XML declaration
            ByteArrayInputStream is = new ByteArrayInputStream(xml.getBytes("UTF-8"));
            doc = docBuilder.parse(is);

        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        NodeList nl = doc.getElementsByTagName("AnalysisDefinition");
        for(int i = 0, stop = nl.getLength(); i < stop; i++){
            Element e = (Element) nl.item(i);
            ad.setVersion(e.getAttribute("Version"));
            ad.setGraphProviderId(e.getAttribute("GraphProviderId"));
            ad.setGroupByField(e.getAttribute("GroupByField"));
            ad.setForceRefresh(Boolean.parseBoolean(e.getAttribute("ForceRefresh")));
            ad.setSumOfField(e.getAttribute("SumOfField"));
            ad.setTimeResolution(e.getAttribute("TimeResolution"));
            ad.setDisplayOptions(e.getAttribute("DisplayOptions"));
        }

        nl = doc.getElementsByTagName("Filter");
        for(int i = 0, stop = nl.getLength(); i < stop; i++){
            Element e = (Element) nl.item(i);
            Filter filter = new Filter();
            filter.setFilterState(e.getAttribute("FilterState"));
            filter.setFilterFormat(e.getAttribute("FilterFormat"));
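            // TableName is not an XML attribute of <Filter>; it sits inside the CDATA text,
            // so getAttribute returns an empty string here.
            filter.setTableName(e.getAttribute("TableName"));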

            ad.setFilter(filter);
        }   
        return ad;
    }
}

1 answer:

Answer 0: (score: 0)

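CDATA means "character data", that is, text without markup. So there are no attributes in your CDATA; there is only text that you can interpret as attributes if you choose to. By wrapping it in CDATA, you have instructed the XML parser not to interpret it in any way. If you do know the syntax of the data inside the CDATA section, whether it is XML or something else such as JSON, you have to pass the CDATA text to an appropriate parser to extract the structure.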
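
As a rough illustration of that two-step approach, the sketch below first reads the text of the Filter element through the DOM and then hands it to a small hand-written parser. The class name CdataFilterParser and both helper methods are hypothetical and not part of the question's code. The parsing rule (key:value pairs separated by commas inside the braces) is an assumption inferred from the sample string only; the real Quality Center filter format may allow commas inside values, in which case the split would need to be smarter.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
public class CdataFilterParser {

    /** Returns the text content of the first <Filter> element, CDATA included. */
    public static String readFilterText(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Merge CDATA sections into ordinary text nodes.
        factory.setCoalescing(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        Element filter = (Element) doc.getElementsByTagName("Filter").item(0);
        return filter.getTextContent();
    }

    /**
     * Parses text shaped like "[Filter]{Key:Value,Key:Value}" into a map.
     * ASSUMPTION: pairs are comma-separated and the first colon splits
     * key from value; values containing commas are not handled.
     */
    public static Map<String, String> parseFilterText(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        int open = text.indexOf('{');
        int close = text.lastIndexOf('}');
        if (open < 0 || close < open) {
            return fields; // no braces found, nothing to parse
        }
        for (String pair : text.substring(open + 1, close).split(",")) {
            int colon = pair.indexOf(':');
            if (colon < 0) {
                continue; // skip fragments that are not key:value
            }
            fields.put(pair.substring(0, colon).trim(),
                       pair.substring(colon + 1).trim());
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Filter FilterState=\"Custom\" FilterFormat=\"Frec\">"
                + "<![CDATA[[Filter]{TableName:TESTCYCL,"
                + "ColumnName:TC_ASSIGN_RCYC,NO_CASE:}]]>"
                + "</Filter>";
        Map<String, String> fields = parseFilterText(readFilterText(xml));
        System.out.println("TableName: " + fields.get("TableName"));
        System.out.println("ColumnName: " + fields.get("ColumnName"));
    }
}
```

In the question's createFilterData loop, something like filter.setTableName(parseFilterText(e.getTextContent()).get("TableName")) could then replace the getAttribute call for TableName, under the same assumptions about the format.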