Parsing a multi-level XML file in Java (DOM Parser)

Posted: 2016-03-15 06:10:22

Tags: java xml xml-parsing domparser

Here is a sample of my XML file:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="xslt/options.xsl"?>
    <options>
      <version>0001</version>
      <title>ConfigData</title>
      <category>
        <name>GConfigData</name>
        <option>
          <name>String_name</name>
          <value>350.16.01a</value>
          <control>
            <type>TextBox2</type>
            <caption> String Name</caption>
            <left>0</left>
            <top>0</top>
            <width>2600</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>FileID</name>
          <value>1601</value>
          <control>
            <type>TextBox2</type>
            <caption>file version</caption>
            <left>0</left>
            <top>900</top>
            <width>2600</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>systemID</name>
          <value>0</value>
          <control>
            <type>TextBox2</type>
            <caption>System ID</caption>
            <left>0</left>
            <top>1800</top>
            <width>2400</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>SyncTime</name>
          <value>2</value>
          <control>
            <type>TextBox2</type>
            <caption>Sync Time</caption>
            <left>0</left>
            <top>2700</top>
            <width>2400</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>UseServer</name>
          <value>0</value>
          <control>
            <type>TextBox2</type>
            <caption>Use Server</caption>
            <left>0</left>
            <top>3600</top>
            <width>2400</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>CommType</name>
          <value>0</value>
          <control>
            <type>FixedList</type>
            <caption>Comm Type</caption>
            <left>0</left>
            <top>4500</top>
            <width>2400</width>
            <height>900</height>
            <list>
              <item>
                <text>Parellel</text>
                <value>0</value>
              </item>
              <item>
                <text>Simple Serial</text>
                <value>1</value>
              </item>
              <item>
                <text>Complex Serial</text>
                <value>2</value>
              </item>
            </list>
          </control>
        </option>
        <option>
          <name>YYBasis</name>
          <value>70</value>
          <control>
            <type>TextBox2</type>
            <caption>Set YY Basis</caption>
            <left>0</left>
            <top>5400</top>
            <width>2400</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>Separator</name>
          <value>46</value>
          <control>
            <type>TextBox2</type>
            <caption>Separator</caption>
            <left>0</left>
            <top>6300</top>
            <width>2400</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>WholeSeparator</name>
          <value>44</value>
          <control>
            <type>TextBox2</type>
            <caption>Whole Separator</caption>
            <left>0</left>
            <top>7200</top>
            <width>2400</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>DateFormat</name>
          <value>0</value>
          <control>
            <type>FixedList</type>
            <caption>Date Format</caption>
            <left>2600</left>
            <top>0</top>
            <width>2400</width>
            <height>900</height>
            <list>
              <item>
                <text>MM/DD/YY</text>
                <value>0</value>
              </item>
              <item>
                <text>MM/DD/YYYY</text>
                <value>1</value>
              </item>
              <item>
                <text>DD/MM/YY</text>
                <value>2</value>
              </item>
              <item>
                <text>DD/MM/YYYY</text>
                <value>3</value>
              </item>
              <item>
                <text>YY/MM/DD</text>
                <value>4</value>
              </item>
              <item>
                <text>MM.DD.YY</text>
                <value>6</value>
              </item>
              <item>
                <text>MM.DD.YYYY</text>
                <value>7</value>
              </item>
              <item>
                <text>DD.MM.YY</text>
                <value>8</value>
              </item>
              <item>
                <text>DD.MM.YYYY</text>
                <value>9</value>
              </item>
              <item>
                <text>YY.MM.DD</text>
                <value>10</value>
              </item>
              <item>
                <text>YYYY.MM.DD</text>
                <value>11</value>
              </item>
            </list>
          </control>
        </option>
      </category>
    </options>

I wrote Java code to parse the name, caption, and value of each option. Here is the code:

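import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class XMLParsingSingleFileFinal {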



    public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException
       {
          //Get Document Builder
          DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
          DocumentBuilder builder = factory.newDocumentBuilder();

          //Build Document
          Document document = builder.parse(new File("options.xml"));

          //Normalize the XML Structure; It's just too important !!
          document.getDocumentElement().normalize();
          XPath xPath =  XPathFactory.newInstance().newXPath();

          //Here comes the root node
          Element root = document.getDocumentElement();
          System.out.println(root.getNodeName());

          //Get all options
          NodeList nList = document.getElementsByTagName("options");
          System.out.println("Total Options = " + nList.getLength());
          System.out.println("TITLE = " + document.getElementsByTagName("title").item(0).getTextContent());
          System.out.println("VERSION = " + document.getElementsByTagName("version").item(0).getTextContent());

          System.out.println("===================================");

          //Get all category
          NodeList nList1 = document.getElementsByTagName("category");
          System.out.println("Total Category inside options = " + nList1.getLength());
          //int count1 = nList1.getLength();


          for (int temp = 0; temp < nList1.getLength(); temp++)
          {
             Node node = nList1.item(temp);
             if (node.getNodeType() == Node.ELEMENT_NODE)
             {
                 Element mElement = (Element) node;
                 System.out.println("\nCategory Name = " + mElement.getElementsByTagName("name").item(0).getTextContent());
                 NodeList nList2 = mElement.getElementsByTagName("option");
                 System.out.println("option inside category = " + nList2.getLength());
                 System.out.println("\n\t");
                // int count = nList2.getLength();


                 for (int temp1 = 0; temp1 < nList2.getLength()/2; temp1++) 
                {

                    Node nNode = nList2.item(temp1);
                    if (nNode.getNodeType() == Node.ELEMENT_NODE)
                    {

                    Element nElement = (Element) nNode;

                 System.out.println("\tOption Name = " + mElement.getElementsByTagName("name").item(temp1+1).getTextContent());
                 System.out.println("\t\tCaption Name = " + mElement.getElementsByTagName("caption").item(temp1).getTextContent());

                 System.out.println("\t\tValue = " + mElement.getElementsByTagName("value").item(temp1).getTextContent());



                 System.out.println("\n\t");

            }

              }  
                 System.out.println("\n\t");
             }   
          }   
       }
}

My main goal is to parse the "value" of each "option" node.

As you can see in the option named CommType, each "item" element also has a child node called "value".

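So the parsed data is correct up to and including the option named CommType. From the next option onward, the code picks up the "value" of the "item" child nodes belonging to the previous option.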

Example:(Parse Result)

options
Total Options = 1
TITLE = ConfigData
VERSION = 0001
===================================
Total Category inside options = 23

Category Name = GConfigData
option inside category = 38


    Option Name = String_name
        Caption Name = String Name
        Value = 350.16.01a


    Option Name = FileID
        Caption Name =  file version
        Value = 1601


    Option Name = SystemID
        Caption Name = System ID
        Value = 0


    Option Name = SyncTime
        Caption Name = Sync Time
        Value = 2


    Option Name = UseServer
        Caption Name = Use Server
        Value = 0


    Option Name = CommType
        Caption Name = Comm Type
        Value = 0


    Option Name = YYBasis
        Caption Name = Set YY Basis
        Value = 0        /* should be 70 as in the XML file, but it takes the value of option(name: CommType)/control/list/item(text: Parellel)/value */


    Option Name = Separator
        Caption Name =  Separator
        Value = 1       /* should be 46 as in the XML file, but it takes the value of option(name: CommType)/control/list/item(text: Simple Serial)/value */


    Option Name =WholeSeparator
        Caption Name = Whole Separator
        Value = 2     /* should be 44 as in the XML file, but it takes the value of option(name: CommType)/control/list/item(text: Complex Serial)/value */


    Option Name = DateFormat
        Caption Name = Date Format
        Value = 70    /* should be 0 */

After the option named CommType, the value of every option is parsed incorrectly.

Is there a solution for this? I am new to Java and XML.

PS: This is my first question on this forum. I apologize for any typos and for any mistakes in how I asked it. Please help me if you can.

2 answers:

Answer 0: (score: 1)

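Don't address nodes by index or offset (a hard-coding anti-pattern); it makes your code brittle. Select each option once and read its children relative to it, for example with dom4j and XPath: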

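import java.io.File;
import java.util.List;
import org.dom4j.Document;
import org.dom4j.Node;
import org.dom4j.io.SAXReader;

// The file name is taken from the question; adjust the path as needed
File file = new File("options.xml");
SAXReader reader = new SAXReader();
Document document = reader.read(file);
// Select every <option> once, then resolve paths relative to that option
List<Node> nodes = document.selectNodes("/options/category/option");

for (Node node : nodes) {
    System.out.println("caption: " + node.selectSingleNode("control/caption").getText());
    System.out.println("value : " + node.selectSingleNode("value").getText());
}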

Sample output (truncated):

caption:  String Name
value : 350.16.01a
caption: file version
value : 1601
caption: System ID
value : 0

Required dependencies:

<dependency>
    <groupId>jaxen</groupId>
    <artifactId>jaxen</artifactId>
    <version>1.1.6</version>
</dependency>

<dependency>
    <groupId>dom4j</groupId>
    <artifactId>dom4j</artifactId>
    <version>1.6.1</version>
</dependency>
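
If adding dom4j and jaxen is not an option, the same idea should also work with the XPath API that ships with the JDK (`javax.xml.xpath`). The following is only a minimal sketch under that assumption: the class name `XPathOptionReader`, the method `readOptionValues`, and the shortened inline XML are made up for illustration and are not part of the original answer.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class, not from the original answer
public class XPathOptionReader {

    // Maps each option name to the <value> that is a direct child of that option
    public static Map<String, String> readOptionValues(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();

        // Select every <option> once
        NodeList options = (NodeList) xPath.evaluate(
                "/options/category/option", document, XPathConstants.NODESET);

        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < options.getLength(); i++) {
            Node option = options.item(i);
            // Relative paths are evaluated against this option only, so the
            // <value> elements nested under control/list/item are not matched
            String name = xPath.evaluate("name", option);
            String value = xPath.evaluate("value", option);
            result.put(name, value);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Shortened stand-in for options.xml from the question
        String xml = "<options><category><name>GConfigData</name>"
                + "<option><name>CommType</name><value>0</value><control>"
                + "<caption>Comm Type</caption><list>"
                + "<item><text>Parellel</text><value>0</value></item>"
                + "<item><text>Simple Serial</text><value>1</value></item>"
                + "</list></control></option>"
                + "<option><name>YYBasis</name><value>70</value><control>"
                + "<caption>Set YY Basis</caption></control></option>"
                + "</category></options>";
        System.out.println(readOptionValues(xml)); // prints {CommType=0, YYBasis=70}
    }
}
```

The relative expression `value` matches only direct children of the current option, which is why YYBasis resolves to 70 here rather than to a value from the CommType list.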

Answer 1: (score: 1)

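The method node.getElementsByTagName() returns every matching element anywhere below node, not just its direct children. Because you always search from the "category" node instead of from the individual option or item nodes, the lists of "name", "caption", and "value" elements are flattened across all options, and the indexes drift as soon as one option contains nested "value" elements. That is why you get unexpected results.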
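
One way to stay with the plain DOM API is to walk only the direct children of each option instead of calling getElementsByTagName() on the category. The sketch below assumes that approach; the class `DirectChildReader` and its helpers `children`, `childText`, and `describeOptions` are hypothetical names introduced for illustration, and the inline XML is a shortened stand-in for the file in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class, not from the original answer
public class DirectChildReader {

    // Collects only the direct child elements of parent that have the given tag name
    public static List<Element> children(Element parent, String tag) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && tag.equals(n.getNodeName())) {
                result.add((Element) n);
            }
        }
        return result;
    }

    // Trimmed text of the first direct child with the given tag, or null if absent
    public static String childText(Element parent, String tag) {
        List<Element> found = children(parent, tag);
        return found.isEmpty() ? null : found.get(0).getTextContent().trim();
    }

    // Builds one "name | caption | value" line per option
    public static List<String> describeOptions(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> lines = new ArrayList<>();
        for (Element category : children(document.getDocumentElement(), "category")) {
            for (Element option : children(category, "option")) {
                // The search base is the option itself, so nested item/value is ignored
                String name = childText(option, "name");
                String value = childText(option, "value");
                List<Element> controls = children(option, "control");
                String caption = controls.isEmpty()
                        ? null : childText(controls.get(0), "caption");
                lines.add(name + " | " + caption + " | " + value);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<options><category><name>GConfigData</name>"
                + "<option><name>CommType</name><value>0</value><control>"
                + "<caption>Comm Type</caption><list>"
                + "<item><text>Parellel</text><value>0</value></item>"
                + "<item><text>Simple Serial</text><value>1</value></item>"
                + "</list></control></option>"
                + "<option><name>YYBasis</name><value>70</value><control>"
                + "<caption>Set YY Basis</caption></control></option>"
                + "</category></options>";
        for (String line : describeOptions(xml)) {
            System.out.println(line);
        }
        // prints:
        // CommType | Comm Type | 0
        // YYBasis | Set YY Basis | 70
    }
}
```

Because every lookup starts from the option being processed, no index arithmetic such as `temp1+1` or `getLength()/2` is needed, and adding or removing nested list items does not shift the results.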