Parsing XML with the Java DOM parser

Date: 2016-03-21 09:05:52

Tags: java xml

I am new to Java and XML, and I need to extract some data from an XML file.

Suppose I want to get all the module codes for year 1, semester 1. I have written some code so far.

This is my XML:

<?xml version="1.0" encoding="UTF-8"?>
<course name="BSc (Hons) Software Engineering" version="5.0" type="FT" lowerbound="2012" upperbound="2014" >
   <year id="1">
      <semester id="1">
         <module>
            <code>HCA1105C</code>
            <name>Computer Architecture</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>PROG1115C</code>
            <name>Object Oriented Software Development I</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>MATH1103C</code>
            <name>Decision Mathematics</name>
            <credits>3</credits>
            <hrs_per_wk>2+1</hrs_per_wk>
         </module>
         <module>
            <code>ITE1107C</code>
            <name>Language and Communication Seminar</name>
            <credits>3</credits>
            <hrs_per_wk>2+1</hrs_per_wk>
         </module>
         <module>
            <code>MGMT1101C</code>
            <name>Management Seminar</name>
            <credits>3</credits>
            <hrs_per_wk>2+1</hrs_per_wk>
         </module>
      </semester>
      <semester id="2">
         <module>
            <code>PROG1116C</code>
            <name>Object Oriented Software Development II</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>WAT1116C</code>
            <name>Internet Programming I</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>MATH1101C</code>
            <name>Analytic Methods for Computing</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>DBT1111C</code>
            <name>Database Design</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
      </semester>
   </year>
   <year id="2">
      <semester id="1">
         <module>
            <code>CAN2112C</code>
            <name>Network Design &amp; Programming</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>WAT2117C</code>
            <name>Internet Programming II</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>OSS2109C</code>
            <name>Operating Systems</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>PROG2117C</code>
            <name>Desktop Application Development</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
      </semester>
      <semester id="2">
         <module>
            <code>SDT2114C</code>
            <name>Requirements Engineering</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>MATH2323C</code>
            <name>Numerical Methods</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>MCT2104C</code>
            <name>Mobile Application Development</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>MCT2104C</code>
            <name>Mobile Application Development</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>WAT2124C</code>
            <name>Web Services</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>MGMT2104C</code>
            <name>Research &amp; Development Seminar</name>
            <credits>3</credits>
            <hrs_per_wk>2+1</hrs_per_wk>
         </module>
      </semester>
   </year>
   <year id="3">
      <semester id="1">
         <module>
            <code>SECU3119C</code>
            <name>Secure Software Development</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>MULT3114C</code>
            <name>Game Development</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>SEM3112C</code>
            <name>Project Management Seminar</name>
            <credits>3</credits>
            <hrs_per_wk>2+1</hrs_per_wk>
         </module>
      </semester>
      <semester id="2">
         <module>
            <code>SDT3104C</code>
            <name>Enterprise Software Development</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>WAT3125C</code>
            <name>Emerging Web Technologies</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>SEM3113C</code>
            <name>Software Quality Management</name>
            <credits>4</credits>
            <hrs_per_wk>2+2</hrs_per_wk>
         </module>
         <module>
            <code>MGMT3105C</code>
            <name>Entrepreneurship Seminar</name>
            <credits>3</credits>
            <hrs_per_wk>2+1</hrs_per_wk>
         </module>
         <module>
            <code>PROJ3105C</code>
            <name>Systems Development Project</name>
            <credits>9</credits>
            <hrs_per_wk />
         </module>
      </semester>
   </year>
</course>

I get the following output:

HCA1105C
PROG1115C
MATH1103C
ITE1107C
MGMT1101C

3 answers:

Answer 0 (score: 2)

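Your code is reading the first module of each year. That is because, for the condition you specified, the node list contains 3 nodes (year = 1, year = 2, year = 3).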
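The question's code is not reproduced on this page, so the following is only a guess at the kind of loop that behaves this way. It assumes the code iterated over the `year` elements and took `item(0)` of the `code` elements under each one. The class name `FirstModulePerYear` and the method `firstCodePerYear` are hypothetical names used for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical reconstruction, not the asker's actual code.
class FirstModulePerYear {

    // Returns the first <code> found under each <year> element.
    // Assumes every <year> contains at least one <code>.
    static List<String> firstCodePerYear(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }

        // One node per <year>: three nodes for the XML in the question.
        NodeList years = doc.getElementsByTagName("year");
        List<String> codes = new ArrayList<>();
        for (int i = 0; i < years.getLength(); i++) {
            Element year = (Element) years.item(i);
            // getElementsByTagName searches all descendants in document order,
            // so item(0) is the first module code of that year, whatever the semester.
            codes.add(year.getElementsByTagName("code").item(0).getTextContent());
        }
        return codes;
    }
}
```

Run against the XML in the question, a loop like this would yield HCA1105C, CAN2112C and SECU3119C: one code per year, with the year and semester ids never checked.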

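If you want to read all the modules of year 1, you need to descend into the part of the document under year="1". That gives you a node list of semesters, and you then need to descend further into the children of the semester with id 1.

Alternatively, you can use an XPath query, which selects the modules with year = 1 and semester = 1 directly.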

http://viralpatel.net/blogs/java-xml-xpath-tutorial-parse-xml/

Edited: the code modified to use XPath

// Requires java.io.File, javax.swing.JOptionPane, javax.xml.parsers.*,
// javax.xml.xpath.* and org.w3c.dom.*
try {
    File inputFile = new File("courses.xml");
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(inputFile);
    doc.getDocumentElement().normalize();

    // Select the <code> of every module in year 1, semester 1
    XPath xPath = XPathFactory.newInstance().newXPath();
    String expression = "/course/year[@id=1]/semester[@id=1]/module/code";
    NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(doc, XPathConstants.NODESET);
    System.out.println(expression);
    for (int i = 0; i < nodeList.getLength(); i++) {
        System.out.println(nodeList.item(i).getTextContent());
    }
} catch (Exception e) {
    JOptionPane.showMessageDialog(null, e.getMessage(), "Fatal Error", JOptionPane.ERROR_MESSAGE);
    System.exit(1);
}
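If the year and semester need to vary, the same XPath query can be wrapped in a small method. This is a minimal sketch rather than part of the original answer: the class name `ModuleCodesByXPath` and the method `moduleCodes` are hypothetical, and the XML is passed as a string so the example does not depend on a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration.
class ModuleCodesByXPath {

    // Returns the module codes for the given year and semester ids.
    static List<String> moduleCodes(String xml, int year, int semester) {
        // Only ints are concatenated into the expression, so no quoting issues arise.
        String expression = "/course/year[@id='" + year + "']"
                + "/semester[@id='" + semester + "']/module/code";
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            // Evaluating against an InputSource parses the XML and runs the query in one call.
            NodeList nodes = (NodeList) xPath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> codes = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                codes.add(nodes.item(i).getTextContent().trim());
            }
            return codes;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate " + expression, e);
        }
    }
}
```

With the XML from the question, `moduleCodes(xml, 1, 1)` would return the five codes from HCA1105C to MGMT1101C, and a year or semester that does not exist returns an empty list.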

Answer 1 (score: 1)

Checking the child nodes and drilling down to the modules gives the result you expect, as shown below:

// Requires java.io.File, javax.swing.JOptionPane, javax.xml.parsers.* and org.w3c.dom.*
public static void main(String[] args) {
    try {
        File inputFile = new File("Snippet.xml");
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(inputFile);
        doc.getDocumentElement().normalize();
        NodeList yearList = doc.getElementsByTagName("year");
        for (int i = 0; i < yearList.getLength(); i++) {
            Node yearNode = yearList.item(i);
            if (yearNode.getNodeType() == Node.ELEMENT_NODE) {
                Element yearElement = (Element) yearNode;
                if (Integer.parseInt(yearElement.getAttribute("id")) == 1) { // Found year 1
                    // Child nodes include whitespace text nodes, hence the ELEMENT_NODE checks
                    NodeList semesterList = yearElement.getChildNodes();
                    for (int j = 0; j < semesterList.getLength(); j++) {
                        Node semesterNode = semesterList.item(j);
                        if (semesterNode.getNodeType() == Node.ELEMENT_NODE) {
                            Element semesterElement = (Element) semesterNode;
                            if (Integer.parseInt(semesterElement.getAttribute("id")) == 1) { // Found semester 1
                                NodeList moduleList = semesterElement.getChildNodes();
                                for (int k = 0; k < moduleList.getLength(); k++) {
                                    Node moduleNode = moduleList.item(k);
                                    if (moduleNode.getNodeType() == Node.ELEMENT_NODE) {
                                        Element moduleElement = (Element) moduleNode;
                                        System.out.println(moduleElement.getElementsByTagName("code").item(0).getTextContent());
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
        JOptionPane.showMessageDialog(null, e.getMessage(), "Fatal Error", JOptionPane.ERROR_MESSAGE);
        System.exit(1);
    }
}

Answer 2 (score: 0)

We can get all the module codes with the following code:

// Requires java.io.File, javax.swing.JOptionPane, javax.xml.parsers.* and org.w3c.dom.*
try {
    File inputFile = new File("src/resources/res.xml");
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(inputFile);
    doc.getDocumentElement().normalize();
    // Every <module> in the document, regardless of year or semester
    NodeList nList = doc.getElementsByTagName("module");
    for (int i = 0; i < nList.getLength(); i++) {
        Node nNode = nList.item(i);
        if (nNode.getNodeType() == Node.ELEMENT_NODE) {
            Element eElement = (Element) nNode;
            System.out.println(eElement.getElementsByTagName("code").item(0).getTextContent());
        }
    }
} catch (Exception e) {
    JOptionPane.showMessageDialog(null, e.getMessage(), "Fatal Error", JOptionPane.ERROR_MESSAGE);
    System.exit(1);
}

We can also get the codes by looping year -> semester -> module and then reading the code element. The code above gives the following result:

HCA1105C PROG1115C MATH1103C ITE1107C MGMT1101C PROG1116C WAT1116C MATH1101C DBT1111C CAN2112C WAT2117C OSS2109C PROG2117C SDT2114C MATH2323C MCT2104C MCT2104C WAT2124C MGMT2104C SECU3119C MULT3114C SEM3112C SDT3104C WAT3125C SEM3113C MGMT3105C PROJ3105C
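The year -> semester -> module loop mentioned above could look roughly like the following. It is a sketch under a few assumptions: the helper `children`, the class `ModuleCodesByLoop` and the method `moduleCodes` are hypothetical names, the ids are compared as strings, and the XML is passed in as a string to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

// Hypothetical helper class for illustration.
class ModuleCodesByLoop {

    // Direct child elements of parent with the given tag name
    // (whitespace text nodes and other tags are skipped).
    static List<Element> children(Element parent, String tag) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && tag.equals(n.getNodeName())) {
                result.add((Element) n);
            }
        }
        return result;
    }

    // Walks year -> semester -> module -> code, keeping only the matching ids.
    static List<String> moduleCodes(String xml, String yearId, String semesterId) {
        Element course;
        try {
            course = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        List<String> codes = new ArrayList<>();
        for (Element year : children(course, "year")) {
            if (!yearId.equals(year.getAttribute("id"))) continue;
            for (Element semester : children(year, "semester")) {
                if (!semesterId.equals(semester.getAttribute("id"))) continue;
                for (Element module : children(semester, "module")) {
                    for (Element code : children(module, "code")) {
                        codes.add(code.getTextContent().trim());
                    }
                }
            }
        }
        return codes;
    }
}
```

Unlike `getElementsByTagName`, which returns every descendant, the `children` helper only looks one level down, so each loop stays within the year and semester that matched.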
