Parsing XML in Java and extracting specific data

Asked: 2013-02-26 15:44:40

Tags: java xml parsing

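I'm not sure how to phrase my question exactly, so I'll split it into two questions.

I have been experimenting with parsing files, XML in particular.

I have found plenty of tutorials and plenty of techniques.

Most tutorials use a simple XML file containing names, phone numbers, and so on.

My 2 questions:

1) How do I extract/display only the data between specific tags? For example, if I only want to display the <FirstName> values, how would I do something like the following (in Java):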

loop
    if <tag> = "FirstName" then name_variable = data in between tags;
    or
    if <tag> = "FirstName" then System.out.printf("the first name is %s\n", name_variable);
end loop

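2) Suppose I am only looking for the second instance of a first name. In some tutorials/examples I have seen how to display all of the data inside a loop. I tried assigning the data to a String "array" and then displaying it outside the loop, but I got stuck. Most importantly, how do I store the parsed XML data by index (in an array) so that it can be used by, or passed to, later code?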

<company>
<Name>My Company</Name>
<Executive type = "CEO">
    <LastName>Smith</LastName>
    <FirstName>Jim</FirstName>
    <street>123 Main Street</street>
    <city>Mytown</city>
    <state>TN</state>
    <zip>11234</zip>
</Executive>
<Executive type = "OEC">
    <LastName>Jones</LastName>
    <FirstName>John</FirstName>
    <street>456 Main Street</street>
    <city>Gotham</city>
    <state>TN</state>
    <zip>11234</zip>
</Executive>
</company>

Below is some code I pieced together. It gets some data out of the XML, but I have not figured out how to store the parsed data by index.

package dom_parsing_in_java;

import org.w3c.dom.*;
import javax.xml.parsers.*;
import java.io.*;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.NamedNodeMap;
//import com.sun.org.apache.xerces.internal.parsers.DOMParser;

public class DOM_Parsing_In_JAVA {

    public static void main(String[] args) {
        // TODO code application logic here
        String file = "test2.xml";

        if (args.length > 0) {
            file = args[0];
        } // end if

        try {
            //DOMParser parser = new DOMParser();
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse(new File(file));

            //Document document = parser.getDocument();

            Element root = document.getDocumentElement();
            System.out.println(root.getTagName());

            NodeList node_list = root.getElementsByTagName("Executive");

            //Node comp = getNode("Company", root);

            int i;

            for (i = 0; i < node_list.getLength(); i++) {
                Element department = (Element) node_list.item(i);

                System.out.println(department.getTagName());
                System.out.println("name " + document.getElementsByTagName("Name").item(0).getTextContent());
                System.out.println("name " + document.getElementsByTagName("FirstName").item(i).getTextContent());
                System.out.printf(" Lastname: %s%n ", document.getElementsByTagName("LastName").item(i));
                System.out.printf(" Lastname: %s%n ", department.getAttribute("LastName"));
                System.out.printf(" FirstName: %s%n", department.getAttribute("FirstName"));
                //System.out.printf(" elements by Tag %s%n", department.getElementsByTagName("testTag"));
                //System.out.printf(" staff: %s%n", countStaff(department));
            }
        } catch (Exception e) {
            e.printStackTrace();
        } // end catch
    }
}

2 Answers:

Answer 0: (score: 0)

Answer 1: (score: 0)

I would go down the XPath route and parse the XML file into a Document.

XPath can be used to navigate an XML document. For more on what you can do with XPath, see http://www.w3schools.com/xpath/default.asp
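As a rough illustration of that kind of navigation, the sketch below evaluates two XPath expressions against a trimmed copy of the XML from the question. The class name XPathSelectExample, the helper select, and the inline XML constant are made up for this example; only the javax.xml.xpath calls come from the JDK. The positional predicate (//FirstName)[2] is one possible way to reach the second first name directly.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical example class; not part of the original question or answer.
class XPathSelectExample {

    // Trimmed copy of the XML from the question (address fields omitted).
    static final String XML =
          "<company><Name>My Company</Name>"
        + "<Executive type=\"CEO\"><LastName>Smith</LastName><FirstName>Jim</FirstName></Executive>"
        + "<Executive type=\"OEC\"><LastName>Jones</LastName><FirstName>John</FirstName></Executive>"
        + "</company>";

    // Evaluates an XPath expression against XML and returns the string
    // value of the first matching node (an empty string if nothing matches).
    static String select(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Second <FirstName> in document order.
        System.out.println(select("(//FirstName)[2]"));                   // prints John
        // <FirstName> of the executive whose type attribute is "CEO".
        System.out.println(select("//Executive[@type='CEO']/FirstName")); // prints Jim
    }
}
```

The parentheses in (//FirstName)[2] matter: without them, //FirstName[2] would look for a FirstName that is the second FirstName child of its own parent, which matches nothing in this document.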

Assuming everything is done in main:

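import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// The parsing and XPath calls throw checked exceptions, so main declares them.
public static void main(String[] args) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse(new File("file.xml"));
    XPathFactory xPathfactory = XPathFactory.newInstance();
    XPath xpath = xPathfactory.newXPath();
    // "//FirstName" selects every FirstName element anywhere in the document
    XPathExpression firstnameExpr = xpath.compile("//FirstName");

    NodeList nl = (NodeList) firstnameExpr.evaluate(doc, XPathConstants.NODESET);

    for (int i = 0; i < nl.getLength(); i++) {
        Node node = nl.item(i);

        // getTextContent() returns the text inside the FirstName element, so
        // there is no need to fetch its child text node (Node.TEXT_NODE) by hand.
        // Note that a NodeList is not an array: use item(i), not [i].
        System.out.println("Firstname[" + i + "] = " + node.getTextContent());
    }
}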

Instead of printing the first-name content to System.out, you can add the values to an ArrayList, which preserves their order, i.e.:

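// requires java.util.List and java.util.ArrayList
List<String> firstnameList = new ArrayList<String>();

for (int i = 0; i < nl.getLength(); i++) {
    Node node = nl.item(i);

    // getTextContent() returns an empty string for an empty element,
    // so no null check on the child nodes is needed here
    firstnameList.add(node.getTextContent());
}

// the list can now be used outside the loop or passed to other code;
// firstnameList.get(1) is the second first name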
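If more than the first name is needed later, one option is to keep all fields of each Executive together rather than in parallel lists. The following is a minimal sketch under that assumption, using plain DOM calls; the class ExecutiveListExample, the nested Executive holder, and the readExecutives helper are hypothetical names introduced here, and the XML is a trimmed inline copy of the file from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical example class; not part of the original question or answer.
class ExecutiveListExample {

    // Trimmed copy of the XML from the question (address fields omitted).
    static final String XML =
          "<company><Name>My Company</Name>"
        + "<Executive type=\"CEO\"><LastName>Smith</LastName><FirstName>Jim</FirstName></Executive>"
        + "<Executive type=\"OEC\"><LastName>Jones</LastName><FirstName>John</FirstName></Executive>"
        + "</company>";

    // Simple holder for the values read from one <Executive> element.
    static final class Executive {
        final String type;
        final String firstName;
        final String lastName;

        Executive(String type, String firstName, String lastName) {
            this.type = type;
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Returns one Executive per <Executive> element, in document order.
    static List<Executive> readExecutives(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("Executive");
            List<Executive> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element exec = (Element) nodes.item(i);
                // "type" is an attribute; FirstName and LastName are child elements,
                // so they are looked up within this Executive element only.
                String type = exec.getAttribute("type");
                String first = exec.getElementsByTagName("FirstName").item(0).getTextContent();
                String last = exec.getElementsByTagName("LastName").item(0).getTextContent();
                result.add(new Executive(type, first, last));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        List<Executive> executives = readExecutives(XML);
        // Index 1 is the second executive in the file.
        Executive second = executives.get(1);
        System.out.println(second.firstName + " " + second.lastName + " (" + second.type + ")");
    }
}
```

Calling getElementsByTagName on the Executive element, rather than on the whole document, keeps each first name paired with the last name from the same element. The sketch assumes every Executive has both child elements; item(0) returns null otherwise.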