Retrieving a list of nodes from an XML file using Java

Time: 2016-03-07 07:45:27

Tags: java xml dom xml-parsing

I have an XML file that looks like this:

<?xml version="1.0"?>
<?xml-stylesheet href="catalog.xsl" type="text/xsl"?>
<!DOCTYPE catalog SYSTEM "catalog.dtd">
<catalog>
   <product description="Cardigan Sweater" product_image="cardigan.jpg">
      <catalog_item gender="Men's">
         <item_number>QWZ5671</item_number>
         <price>39.95</price>
         <size description="Medium">
            <color_swatch image="red_cardigan.jpg">Red</color_swatch>
            <color_swatch image="burgundy_cardigan.jpg">Burgundy</color_swatch>
         </size>
         <size description="Large">
            <color_swatch image="red_cardigan.jpg">Red</color_swatch>
            <color_swatch image="burgundy_cardigan.jpg">Burgundy</color_swatch>
         </size>
      </catalog_item>
      <catalog_item gender="Women's">
         <item_number>RRX9856</item_number>
         <price>42.50</price>
         <size description="Small">
            <color_swatch image="red_cardigan.jpg">Red</color_swatch>
            <color_swatch image="navy_cardigan.jpg">Navy</color_swatch>
            <color_swatch image="burgundy_cardigan.jpg">Burgundy</color_swatch>
         </size>
         <size description="Medium">
            <color_swatch image="red_cardigan.jpg">Red</color_swatch>
            <color_swatch image="navy_cardigan.jpg">Navy</color_swatch>
            <color_swatch image="burgundy_cardigan.jpg">Burgundy</color_swatch>
            <color_swatch image="black_cardigan.jpg">Black</color_swatch>
         </size>
      </catalog_item>
   </product>
</catalog>

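What is the best way in Java to extract all nodes with a specific name (catalog_item) and create a List (a list of catalog items)? Note that the XML may contain any list of nodes; I should be able to specify a node name and extract all nodes with that name to build the list.

3 answers:

Answer 0 (score: 2)

You can use an HTML parser such as Jsoup: download the jar and add it to your project, then do the following.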

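import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;
import org.jsoup.select.Elements;

// `xml` holds the file contents; the XML parser keeps the input from being normalized as HTML
Document document = Jsoup.parse(xml, "", Parser.xmlParser());
Elements elements = document.select("catalog_item"); // every <catalog_item> element

for (Element element : elements) {
    String price = element.getElementsByTag("price").text(); // text of a specific child tag
    // read the remaining child tags you need the same way
}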

Answer 1 (score: 1)

Here is code that extracts the nodes with vtd-xml. The extraction logic is the part you need to fill in...

import com.ximpleware.*;

public class RetrieveNodes {
    public static void main(String[] args) throws VTDException, java.io.IOException {
        VTDGen vg = new VTDGen();
        vg.setLCDepth(5);
        if (!vg.parseFile("input.xml", false))
            return;
        VTDNav vn = vg.getNav();
        AutoPilot ap = new AutoPilot(vn);
        ap.selectXPath("/catalog/product/catalog_item");
        int i=0;
        while((i=ap.evalXPath())!=-1){
           // move to the <item_number> child, print its text, then return to <catalog_item>
           if (vn.toElement(VTDNav.FIRST_CHILD,"item_number")){
               int j=vn.getText();
               if (j!=-1)
                  System.out.println("text node ==>"+vn.toString(j));
               vn.toElement(VTDNav.PARENT);
           }
        }

    }

}
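
For comparison, the same kind of XPath query can be run with the JDK's built-in `javax.xml.xpath` API when adding vtd-xml is not an option. This is only a minimal sketch: the class name `XPathItemNumbers`, the method `itemNumbers`, and the trimmed-down `SAMPLE_XML` string are made up for illustration, and the sample omits the DOCTYPE so that no `catalog.dtd` lookup takes place.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of vtd-xml or of the original answer.
final class XPathItemNumbers {

    // Shortened sample in the shape of the question's catalog (assumption: no DOCTYPE).
    static final String SAMPLE_XML =
        "<catalog><product description=\"Cardigan Sweater\">"
      + "<catalog_item gender=\"Men's\"><item_number>QWZ5671</item_number>"
      + "<price>39.95</price></catalog_item>"
      + "<catalog_item gender=\"Women's\"><item_number>RRX9856</item_number>"
      + "<price>42.50</price></catalog_item>"
      + "</product></catalog>";

    // Returns the text of every <item_number> under /catalog/product/catalog_item.
    static List<String> itemNumbers(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                "/catalog/product/catalog_item/item_number",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            // Wrapped in an unchecked exception to keep the sketch short.
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(itemNumbers(SAMPLE_XML)); // prints [QWZ5671, RRX9856]
    }
}
```

The built-in API builds a full DOM tree before evaluating the expression, so for very large files the vtd-xml version above may be the better fit.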

Answer 2 (score: 0)

I want to post my approach in detail to help anyone who needs help with the same scenario.

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(new File("C:/ProductItems.xml"));
doc.getDocumentElement().normalize();

// Read all the catalog items and store them in a NodeList
NodeList catItemList = doc.getElementsByTagName("catalog_item");

// The loop body simply does not run when there are no catalog items
for (int itemIndex = 0; itemIndex < catItemList.getLength(); itemIndex++) {
    Node catalogItem = catItemList.item(itemIndex);

    if (catalogItem.getNodeType() == Node.ELEMENT_NODE) {
        Element eElement = (Element) catalogItem;
        String gender = eElement.getAttribute("gender"); // read further attributes or children here
    }
}
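
Since the question asks for a List built from an arbitrary node name, the loop above could be wrapped into a small reusable helper. The following is a sketch under stated assumptions, not a definitive implementation: the names `NodeListExtractor`, `parse`, `elementsByTagName`, and `SAMPLE_XML` are hypothetical, and the sample XML is a shortened version of the question's file without the DOCTYPE.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, named for illustration only.
final class NodeListExtractor {

    // Shortened sample in the shape of the question's catalog (assumption: no DOCTYPE).
    static final String SAMPLE_XML =
        "<catalog><product description=\"Cardigan Sweater\">"
      + "<catalog_item gender=\"Men's\"><item_number>QWZ5671</item_number>"
      + "<price>39.95</price></catalog_item>"
      + "<catalog_item gender=\"Women's\"><item_number>RRX9856</item_number>"
      + "<price>42.50</price></catalog_item>"
      + "</product></catalog>";

    // Parses an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped in an unchecked exception to keep the sketch short.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Collects every element with the given tag name, at any depth, into a List.
    static List<Element> elementsByTagName(Document doc, String tagName) {
        NodeList nodes = doc.getElementsByTagName(tagName);
        List<Element> result = new ArrayList<>(nodes.getLength());
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) node);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE_XML);
        for (Element item : elementsByTagName(doc, "catalog_item")) {
            System.out.println(item.getAttribute("gender"));
        }
    }
}
```

Copying into an `ArrayList` gives a plain snapshot that works with for-each loops and streams, whereas the `NodeList` returned by `getElementsByTagName` is a live view of the document. When parsing the original file, keep in mind that its `<!DOCTYPE catalog SYSTEM "catalog.dtd">` line will likely make the parser try to load `catalog.dtd`, so that file needs to be reachable or DTD loading needs to be turned off on the factory.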