How do I extract data from nested XML using Java DOM?

Time: 2014-03-06 21:51:41

Tags: java xml dom

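I have an XML document that contains multiple transactions. I can get the account details and so on, but I am having trouble getting things like card_type, year, month, first_six, etc.

There are 200 transactions in this document, hence the loop.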

  <transaction href="https://test.com" type="cc">
    <source>subscription</source>
    <created_at type="datetime">2014-03-06T20:59:03Z</created_at>
    <details>
      <account>
        <account_code>234234234</account_code>
        <first_name>asdadad</first_name>
        <last_name>asdadasd3433</last_name>
        <company nil="nil"></company>
        <email>test@gmail.com</email>
        <billing_info type="credit_card">
          <first_name>asdasdasd</first_name>
          <last_name>asdasdasd23434</last_name>
          <address1 nil="nil"></address1>
          <address2 nil="nil"></address2>
          <city nil="nil"></city>
          <state nil="nil"></state>
          <zip nil="nil"></zip>
          <country nil="nil"></country>
          <phone nil="nil"></phone>
          <vat_number nil="nil"></vat_number>
          <card_type>Visa</card_type>
          <year type="integer">2039</year>
          <month type="integer">6</month>
          <first_six>111111</first_six>
          <last_four>9999</last_four>
        </billing_info>
      </account>
    </details>
    <a name="refund" href="https://test.com/refund" method="delete"/>
  </transaction>

I get this error when I run my code:

java.lang.NullPointerException
        at test.test.getTransactions(test.java:288)
        at test.test.main(test.java:53)

Here is what I am trying:

try {
  NodeList nList2 = eElement.getElementsByTagName("details");
  Node nNode2 = nList2.item(0);
  Element eElement2 = (Element) nNode2;

  //get some other info in try catch blocks here (removed for reading)

  try {
    System.out.println("attempting billing info");
    NodeList nList3 = eElement2.getElementsByTagName("billing_info");
    Node nNode3 = nList3.item(0);
    Element eElement3 = (Element) nNode3;    
    System.out.println("attempting credit_year");
    System.out.println("credit_year: " + eElement3.getElementsByTagName("credit_year").item(0).getTextContent());
  } catch (Exception ex) {
    ex.printStackTrace();
  }

}

4 Answers:

Answer 0 (score: 8)

Here is some code to guide you through parsing the XML file with DOM. You were missing the document builder.

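    //Imports needed by this snippet:
    //  java.io.StringReader
    //  javax.xml.parsers.DocumentBuilder, javax.xml.parsers.DocumentBuilderFactory
    //  org.w3c.dom.Document, org.w3c.dom.Element, org.w3c.dom.Node, org.w3c.dom.NodeList
    //  org.xml.sax.InputSource

    //Build the document from the xmlString
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(new InputSource(new StringReader(xmlString)));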

    //Get all the transaction elements and then loop over them
    NodeList transaction = doc.getElementsByTagName("transaction");
    for(int j = 0; j < transaction.getLength(); j++) {
        //Traverse down the transaction node till we get the billing info
        NodeList details = ((Element)transaction.item(j)).getElementsByTagName("details");
        NodeList account = ((Element)details.item(0)).getElementsByTagName("account");
        NodeList billinginfo = ((Element)account.item(0)).getElementsByTagName("billing_info");

        System.out.println("===Billing Info===");
        System.out.println("Type: "+((Element)billinginfo.item(0)).getAttribute("type"));

        //Get all children nodes from billing info
        NodeList billingChildren = billinginfo.item(0).getChildNodes();

        for(int i = 0; i < billingChildren.getLength(); i++) {
            Node current = billingChildren.item(i);
            //Only want stuff from ELEMENT nodes
            if(current.getNodeType() == Node.ELEMENT_NODE) {
                System.out.println(current.getNodeName()+": "+current.getTextContent());
            }
        }
    }

This produces the following output from your sample.

===Billing Info===
Type: credit_card
first_name: asdasdasd
last_name: asdasdasd23434
address1:
address2:
city:
state:
zip:
country:
phone:
vat_number:
card_type: Visa
year: 2039
month: 6
first_six: 111111
last_four: 9999
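
If only a few named fields are needed rather than every child of billing_info, the same traversal can be narrowed down. The following is a minimal sketch, not the answerer's code: the class name BillingInfoReader, the helper textOf, and the <transactions> root element wrapping the sample are assumptions made for illustration. It uses only the standard javax.xml.parsers and org.w3c.dom APIs and returns null for a field that is absent instead of failing.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class BillingInfoReader {

    // Hypothetical helper: text of the first descendant named `tag`, or null if there is none.
    static String textOf(Element parent, String tag) {
        NodeList matches = parent.getElementsByTagName(tag);
        return matches.getLength() == 0 ? null : matches.item(0).getTextContent();
    }

    // Collects selected billing fields from the first billing_info of each transaction.
    static List<Map<String, String>> readBilling(String xml) {
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }

        List<Map<String, String>> rows = new ArrayList<>();
        NodeList transactions = doc.getElementsByTagName("transaction");
        for (int i = 0; i < transactions.getLength(); i++) {
            Element transaction = (Element) transactions.item(i);
            NodeList billing = transaction.getElementsByTagName("billing_info");
            if (billing.getLength() == 0) {
                continue; // this transaction carries no billing info
            }
            Element billingInfo = (Element) billing.item(0);
            Map<String, String> row = new LinkedHashMap<>();
            for (String tag : new String[] {"card_type", "year", "month", "first_six", "last_four"}) {
                row.put(tag, textOf(billingInfo, tag));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Trimmed copy of the sample; the <transactions> wrapper is an assumption.
        String xml = "<transactions><transaction type=\"cc\"><details><account>"
                + "<billing_info type=\"credit_card\">"
                + "<card_type>Visa</card_type>"
                + "<year type=\"integer\">2039</year>"
                + "<month type=\"integer\">6</month>"
                + "<first_six>111111</first_six>"
                + "<last_four>9999</last_four>"
                + "</billing_info></account></details></transaction></transactions>";
        System.out.println(readBilling(xml));
    }
}
```

Checking getLength() before calling item(0) is the design choice that matters here: a tag name that does not exist yields an empty NodeList, and the helper turns that into a null value rather than an exception.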

Answer 1 (score: 0)

If possible, use an API such as Jackson to parse the XML. Here is a question that may help you.

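Answer 2 (score: 0)

You can use the Declarative Stream Mapping (DSM) stream parsing library to parse complex XML easily.

You only need to define a mapping for the data you want to extract from the XML.

Here is the mapping definition for your XML.

DSM ignores namespaces.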

result:     
   type: array
   path: /transactions/transaction       
   fields:
       source:          
       account:
          type: array
          path: details/account
          fields:       
             accountCode: 
               path: account_code                 
             firstName: 
               path: first_name                 
             lastName: 
               path: last_name                 
             first_six: 
               path: billing_info/first_six
               dataType: int                 
             last_four: 
               path: billing_info/last_four
               dataType: int                 
             card_type: 
               path: billing_info/card_type

Java code to parse the XML:

DSM dsm = new DSMBuilder(new File("path/to/mapping.yaml")).setType(DSMBuilder.TYPE.XML).create();
Object result = dsm.toObject(xmlFileContent);
// JSON representation of the result
dsm.getObjectMapper().writerWithDefaultPrettyPrinter().writeValue(System.out, result);

Here is the output:

[ {
  "source" : "subscription",
  "account" : [ {
    "accountCode" : "234234234",
    "firstName" : "asdadad",
    "lastName" : "asdadasd3433",
    "card_type" : "Visa",
    "first_six" : 111111,
    "last_four" : 9999
  } ]
} ]

If you want to deserialize directly into a POJO class, you can also do that with DSM.

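Answer 3 (score: 0)

You are calling eElement3.getElementsByTagName("credit_year"), but there is no credit_year element in your XML. The element is named "year", not "credit_year", so item(0) returns null and the chained getTextContent() call throws the NullPointerException. Try eElement3.getElementsByTagName("year") instead.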
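
The failure can be reproduced in isolation. The following is a small sketch under stated assumptions: the class name MissingTagDemo and the helpers parseRoot and textOrDefault are hypothetical names introduced for illustration, and the XML is cut down to a single billing_info element. It shows that NodeList.item(0) returns null for a tag name that matches nothing, and one way to guard against it.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class MissingTagDemo {

    // Hypothetical helper: parses the string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Hypothetical helper: text of the first descendant named `tag`, or `fallback` if absent.
    static String textOrDefault(Element parent, String tag, String fallback) {
        Node first = parent.getElementsByTagName(tag).item(0); // null when nothing matches
        return first == null ? fallback : first.getTextContent();
    }

    public static void main(String[] args) {
        Element billingInfo = parseRoot(
                "<billing_info type=\"credit_card\"><year type=\"integer\">2039</year></billing_info>");

        // No element is called credit_year, so the list is empty and item(0) is null.
        System.out.println(billingInfo.getElementsByTagName("credit_year").getLength()); // prints 0
        System.out.println(billingInfo.getElementsByTagName("credit_year").item(0) == null); // prints true

        // The guarded lookup returns the value for the real tag and the fallback otherwise.
        System.out.println(textOrDefault(billingInfo, "year", "(missing)"));        // prints 2039
        System.out.println(textOrDefault(billingInfo, "credit_year", "(missing)")); // prints (missing)
    }
}
```

With a guard like this, a misspelled or optional tag shows up as a visible placeholder in the output instead of a stack trace from inside the loop over 200 transactions.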