Java DOM parsing not working for deep XML structure

Time: 2017-02-03 16:25:44

Tags: java xml dom

Java code

    XPathExpression readOcc = xpath.compile("//flexTM/attrGroupMany[contains(@name,'allergenRelatedInformation')]");
    Object rObj = (Object) readOcc.evaluate(doc, XPathConstants.NODESET);
    NodeList agm = (NodeList) rObj;

    System.out.println("" + agm.getLength());

    for (int i = 0; i < agm.getLength(); i++) {
        Element element = (Element) agm.item(i).getChildNodes();
        NodeList row = element.getElementsByTagName("row");
        System.out.println("row len " + row.getLength());

        for (int j = 0; j < row.getLength(); j++) {
            Element eAttr = (Element) row.item(j);
            System.out.println(eAttr.getNodeName());
            NodeList attr = eAttr.getElementsByTagName("attrGroupMany");

            for (int k = 0; k < attr.getLength(); k++) {
                Element eAgm = (Element) attr.item(k);
                System.out.println(eAgm.getNodeName());
                NodeList iattr = eAgm.getChildNodes();
                System.out.println(iattr.getLength());
                System.out.println(iattr.item(1).getNodeValue());
                //NodeList iattr = eAgm.getElementsByTagName("row");

                for (int l = 0; i < iattr.getLength(); l++) {
                    Element iAttr = (Element) iattr.item(l);
                    System.out.println(iAttr.getNodeName());

                    //System.out.println(iAttr.getNodeValue());
                }
            }
        }
    }

XML

<item>

<attrGroupMany name="manufacturer">
              <row>
                 <attr name="gln">7689</attr>
                 <attr name="name">XYZ Inc</attr>
              </row>
           </attrGroupMany>
           <attrGroupMany name="allergenRelatedInformation">
              <row>
                 <attr name="allergenSpecificationAgency">FDA</attr>
                 <attr name="allergenSpecificationName">BIG 8</attr>
                 <attrGroupMany name="allergen">
                    <row>
                       <attr name="allergenTypeCode">AC</attr>
                       <attr name="levelOfContainmentCode">FREE_FROM</attr>
                    </row>
                 </attrGroupMany>
              </row>
              <row>
                 <attr name="allergenSpecificationAgency">FDA</attr>
                 <attr name="allergenSpecificationName">BIG 8</attr>
                 <attrGroupMany name="allergen">
                    <row>
                       <attr name="allergenTypeCode">AE</attr>
                       <attr name="levelOfContainmentCode">FREE_FROM</attr>
                    </row>
                 </attrGroupMany>
              </row>
              <row>
                 <attr name="allergenSpecificationAgency">FDA</attr>
                 <attr name="allergenSpecificationName">BIG 8</attr>
                 <attrGroupMany name="allergen">
                    <row>
                       <attr name="allergenTypeCode">AF</attr>
                       <attr name="levelOfContainmentCode">FREE_FROM</attr>
                    </row>
                 </attrGroupMany>
              </row>
              <row>
                 <attr name="allergenSpecificationAgency">FDA</attr>
                 <attr name="allergenSpecificationName">BIG 8</attr>
                 <attrGroupMany name="allergen">
                    <row>
                       <attr name="allergenTypeCode">AM</attr>
                       <attr name="levelOfContainmentCode">FREE_FROM</attr>
                    </row>
                 </attrGroupMany>
              </row>


           </attrGroupMany>
    </item>
 <item>

<attrGroupMany name="manufacturer">
              <row>
                 <attr name="gln">7689</attr>
                 <attr name="name">XYZ Inc</attr>
              </row>
           </attrGroupMany>
           <attrGroupMany name="allergenRelatedInformation">
              <row>
                 <attr name="allergenSpecificationAgency">FDA</attr>
                 <attr name="allergenSpecificationName">BIG 8</attr>
                 <attrGroupMany name="allergen">
                    <row>
                       <attr name="allergenTypeCode">AC</attr>
                       <attr name="levelOfContainmentCode">FREE_FROM</attr>
                    </row>
                 </attrGroupMany>
              </row>
              <row>
                 <attr name="allergenSpecificationAgency">FDA</attr>
                 <attr name="allergenSpecificationName">BIG 8</attr>
                 <attrGroupMany name="allergen">
                    <row>
                       <attr name="allergenTypeCode">AE</attr>
                       <attr name="levelOfContainmentCode">FREE_FROM</attr>
                    </row>
                 </attrGroupMany>
              </row>
              <row>
                 <attr name="allergenSpecificationAgency">FDA</attr>
                 <attr name="allergenSpecificationName">BIG 8</attr>
                 <attrGroupMany name="allergen">
                    <row>
                       <attr name="allergenTypeCode">AF</attr>
                       <attr name="levelOfContainmentCode">FREE_FROM</attr>
                    </row>
                 </attrGroupMany>
              </row>
              <row>
                 <attr name="allergenSpecificationAgency">FDA</attr>
                 <attr name="allergenSpecificationName">BIG 8</attr>
                 <attrGroupMany name="allergen">
                    <row>
                       <attr name="allergenTypeCode">AM</attr>
                       <attr name="levelOfContainmentCode">FREE_FROM</attr>
                    </row>
                 </attrGroupMany>
              </row>


           </attrGroupMany>
    </item>

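In the XML above there are 2 item tags, and each has its own attrGroupMany node whose name attribute is allergenRelatedInformation. I am trying to parse the XML at each level so that I can print all the values of the parent and child nodes. I don't know what the error in the code above is, but it fails.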

1 Answer:

Answer 0: (score: 0)

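I'd suggest not using the org.w3c.dom.* classes directly, since you tend to end up with code that is hard to read and maintain. You could write a class hierarchy and use JAXB to unmarshal the XML into it, which is the general Java approach.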
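If you do stay with org.w3c.dom, the code in the question most likely fails for these reasons: the innermost loop tests `i` instead of `l`; `getChildNodes()` also returns whitespace text nodes, which cannot be cast to `Element`; the result of `getChildNodes()` is itself cast to `Element`, which is not guaranteed to work; and `getElementsByTagName("row")` returns every descendant row, nested ones included. Below is a minimal sketch of a recursive walk that avoids these problems. It assumes a wrapping `<data>` root and a trimmed sample; the class and method names (`AllergenDomWalk`, `childElements`, `walk`, `collect`) are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AllergenDomWalk {

    // Trimmed sample; the <data> root is an assumption, the question's XML has no single root.
    public static final String SAMPLE =
          "<data><item>\n"
        + "<attrGroupMany name=\"allergenRelatedInformation\">\n"
        + "  <row>\n"
        + "    <attr name=\"allergenSpecificationAgency\">FDA</attr>\n"
        + "    <attrGroupMany name=\"allergen\">\n"
        + "      <row><attr name=\"allergenTypeCode\">AC</attr></row>\n"
        + "    </attrGroupMany>\n"
        + "  </row>\n"
        + "</attrGroupMany>\n"
        + "</item></data>";

    /** Returns direct child elements only, skipping text nodes and comments. */
    public static List<Element> childElements(Node parent) {
        List<Element> out = new ArrayList<>();
        NodeList kids = parent.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node n = kids.item(i);
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                out.add((Element) n);
            }
        }
        return out;
    }

    /** Records "path/attrName = value" for each attr, recursing into nested groups. */
    public static void walk(Element group, String path, List<String> out) {
        String here = path + "/" + group.getAttribute("name");
        for (Element row : childElements(group)) {
            if (!"row".equals(row.getTagName())) {
                continue;
            }
            for (Element child : childElements(row)) {
                if ("attr".equals(child.getTagName())) {
                    out.add(here + "/" + child.getAttribute("name")
                            + " = " + child.getTextContent().trim());
                } else if ("attrGroupMany".equals(child.getTagName())) {
                    walk(child, here, out);
                }
            }
        }
    }

    public static List<String> collect(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> out = new ArrayList<>();
        NodeList groups = doc.getElementsByTagName("attrGroupMany");
        for (int i = 0; i < groups.getLength(); i++) {
            Element g = (Element) groups.item(i);
            // Start only at allergenRelatedInformation groups; nested groups are reached by recursion.
            if (g.getAttribute("name").contains("allergenRelatedInformation")) {
                walk(g, "", out);
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        collect(SAMPLE).forEach(System.out::println);
    }
}
```

Filtering on `Node.ELEMENT_NODE` before casting is the key change; the recursion then handles any nesting depth without adding another loop per level.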

Alternatively, if you would rather work with the XML more directly and can add a library, you could use Dynamics.

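Let me give an example using it. Assume the XML is well-formed, i.e. it has a single parent element; let's call it 'data'. Your document then looks like the following.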

<data>
  <item>
    <!-- contents as you specified -->
  </item>
  <item>
    <!-- contents as you specified -->
  </item>
</data>

An XmlDynamic instance lets you traverse the structure in a simple, null-safe way, with the same power and directness.

Let's get the name attribute of the first 'attr' in the second item:

XmlDynamic allergenInfo = new XmlDynamic(xmlStringOrReaderOrInputSourceEtc);

String firstAttrName = allergenInfo
    .get("data|item[1]|attrGroupMany|row|attr|@name")
    .asString(); // gln

Or walk the whole document and print the attr names and values:

allergenInfo.allChildren()
    .filter(hasElementName("attr")) // import static alexh.weak.XmlDynamic.hasElementName
    .filter(attr -> attr.get("@name").isPresent())
    .forEach(attr -> System.out.println(attr.get("@name").asString() + " -> " + attr.asString()));
// prints all attr names -> values
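For comparison, a roughly equivalent listing can be produced with only the JDK's javax.xml.xpath API, without any extra dependency. This is a sketch, not part of the Dynamics API; the class name `AttrLister`, the method `list`, and the trimmed sample with a `<data>` root are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AttrLister {

    // Trimmed sample; the <data> root is assumed, as in the example document above.
    public static final String SAMPLE =
          "<data><item><attrGroupMany name=\"manufacturer\"><row>"
        + "<attr name=\"gln\">7689</attr>"
        + "<attr name=\"name\">XYZ Inc</attr>"
        + "</row></attrGroupMany></item></data>";

    /** Returns "name -> value" for every attr element that has a name attribute. */
    public static List<String> list(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // "//attr[@name]" selects attr elements at any depth, in document order.
        NodeList attrs = (NodeList) xpath.evaluate(
                "//attr[@name]",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            Element attr = (Element) attrs.item(i);
            out.add(attr.getAttribute("name") + " -> " + attr.getTextContent().trim());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        list(SAMPLE).forEach(System.out::println);
    }
}
```

Because the XPath expression only ever matches elements, the cast to `Element` is safe here, unlike a cast applied to the result of `getChildNodes()`.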

It is a single, lightweight extra dependency, e.g. in Maven:

<dependency>
  <groupId>com.github.alexheretic</groupId>
  <artifactId>dynamics</artifactId>
  <version>4.0</version>
</dependency>

See more examples at https://github.com/alexheretic/dynamics#xml-dynamics