Java: Most efficient way to iterate over all elements in an org.w3c.dom.Document?

Time: 2011-03-22 04:55:16

Tags: java xml dom iteration

What is the most efficient way to iterate through all DOM elements in Java?

Something like this, but for every DOM element in the current org.w3c.dom.Document?

for(Node childNode = node.getFirstChild(); childNode!=null;){
    Node nextChild = childNode.getNextSibling();
    // Do something with childNode, including move or delete...
    childNode = nextChild;
}

3 answers:

Answer 0: (score: 117)

Basically, you have two ways to iterate over all elements:

1. Using recursion (the most common way, I think):

public static void main(String[] args) throws SAXException, IOException,
        ParserConfigurationException, TransformerException {

    DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory
        .newInstance();
    DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
    Document document = docBuilder.parse(new File("document.xml"));
    doSomething(document.getDocumentElement());
}

public static void doSomething(Node node) {
    // do something with the current node instead of System.out
    System.out.println(node.getNodeName());

    NodeList nodeList = node.getChildNodes();
    for (int i = 0; i < nodeList.getLength(); i++) {
        Node currentNode = nodeList.item(i);
        if (currentNode.getNodeType() == Node.ELEMENT_NODE) {
            // recurse into every child node that is an Element
            doSomething(currentNode);
        }
    }
}

2. Avoiding recursion by using the getElementsByTagName() method with * as the parameter:

public static void main(String[] args) throws SAXException, IOException,
        ParserConfigurationException, TransformerException {

    DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory
            .newInstance();
    DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
    Document document = docBuilder.parse(new File("document.xml"));

    NodeList nodeList = document.getElementsByTagName("*");
    for (int i = 0; i < nodeList.getLength(); i++) {
        Node node = nodeList.item(i);
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            // do something with the current element
            System.out.println(node.getNodeName());
        }
    }
}

I think both of these ways are efficient. Hope this helps.
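As a quick sanity check, here is a minimal self-contained sketch that runs both approaches on a small inline XML string and collects the element names in visit order. The class name ElementWalkDemo, the parse helper, and the sample XML are assumptions made for illustration; they are not part of the code above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ElementWalkDemo {

    // Hypothetical helper: parses an XML string and wraps checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Approach 1: recursion, recording each element name as it is visited.
    static void collectRecursive(Node node, List<String> out) {
        out.add(node.getNodeName());
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                collectRecursive(child, out);
            }
        }
    }

    // Approach 2: getElementsByTagName("*"), which matches every element.
    static List<String> collectByTagName(Document doc) {
        List<String> out = new ArrayList<>();
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            out.add(all.item(i).getNodeName());
        }
        return out;
    }

    public static void main(String[] args) {
        Document doc = parse("<a><b><c/></b><d/></a>");
        List<String> recursive = new ArrayList<>();
        collectRecursive(doc.getDocumentElement(), recursive);
        System.out.println(recursive);
        System.out.println(collectByTagName(doc));
    }
}
```

Both traversals visit elements in document order, so for this sample input the two printed lists are the same: [a, b, c, d].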

Answer 1: (score: 35)

for (int i = 0; i < nodeList.getLength(); i++)

changed to

for (int i = 0, len = nodeList.getLength(); i < len; i++)

is more efficient.

The second way in javanna's answer may be the best, as it tends to use a flatter, more predictable memory model.
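One caveat with caching the length: the NodeList returned by getElementsByTagName() is live, so it changes when the tree is modified. If the loop body removes or moves nodes, as the question mentions, a cached length can go stale and indices can shift. Below is a minimal sketch of one way around that, copying the live list into an ArrayList before touching the tree. The names LiveNodeListDemo, parse, snapshot, and removeAll are hypothetical and used only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LiveNodeListDemo {

    // Hypothetical helper: parses an XML string and wraps checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Copies the live NodeList into a plain list. Caching the length is safe
    // here because nothing modifies the tree while the copy is being made.
    static List<Element> snapshot(NodeList nodes) {
        List<Element> copy = new ArrayList<>();
        for (int i = 0, len = nodes.getLength(); i < len; i++) {
            copy.add((Element) nodes.item(i));
        }
        return copy;
    }

    // Removes every element with the given tag name and returns how many
    // elements were in the snapshot.
    static int removeAll(Document doc, String tagName) {
        List<Element> targets = snapshot(doc.getElementsByTagName(tagName));
        for (Element e : targets) {
            Node parent = e.getParentNode();
            if (parent != null) {
                parent.removeChild(e);
            }
        }
        return targets.size();
    }

    public static void main(String[] args) {
        Document doc = parse("<r><x/><y/><x/></r>");
        System.out.println("removed: " + removeAll(doc, "x"));
        System.out.println("remaining elements: "
                + doc.getElementsByTagName("*").getLength());
    }
}
```

When the loop only reads nodes, iterating the live list directly with a cached length is fine; the snapshot only matters once the body starts changing the tree.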

Answer 2: (score: 2)

I recently stumbled over this problem as well. Here is my solution. I wanted to avoid recursion, so I used a while loop.

Because of the adds and removes at arbitrary positions in the list, I went with a LinkedList.

/* traverses tree starting with given node */
  private static List<Node> traverse(Node n)
  {
    return traverse(Arrays.asList(n));
  }

  /* traverses tree starting with given nodes */
  private static List<Node> traverse(List<Node> nodes)
  {
    List<Node> open = new LinkedList<Node>(nodes);
    List<Node> visited = new LinkedList<Node>();

    ListIterator<Node> it = open.listIterator();
    while (it.hasNext() || it.hasPrevious())
    {
      Node unvisited;
      if (it.hasNext())
        unvisited = it.next();
      else
        unvisited = it.previous();

      it.remove();

      List<Node> children = getChildren(unvisited);
      for (Node child : children)
        it.add(child);

      visited.add(unvisited);
    }

    return visited;
  }

  private static List<Node> getChildren(Node n)
  {
    List<Node> children = asList(n.getChildNodes());
    Iterator<Node> it = children.iterator();
    while (it.hasNext())
      if (it.next().getNodeType() != Node.ELEMENT_NODE)
        it.remove();
    return children;
  }

  private static List<Node> asList(NodeList nodes)
  {
    List<Node> list = new ArrayList<Node>(nodes.getLength());
    for (int i = 0, l = nodes.getLength(); i < l; i++)
      list.add(nodes.item(i));
    return list;
  }
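
For comparison, here is a sketch of a different non-recursive technique (not the LinkedList/ListIterator approach above): an explicit stack built on java.util.ArrayDeque. Children are pushed in reverse so that elements come back in document order. The class name DequeTraversal and the parse and names helpers are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DequeTraversal {

    // Hypothetical helper: parses an XML string and wraps checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Pre-order traversal of the element subtree rooted at the given node,
    // using an explicit stack instead of recursion.
    static List<Node> traverse(Node root) {
        List<Node> visited = new ArrayList<>();
        Deque<Node> open = new ArrayDeque<>();
        open.push(root);
        while (!open.isEmpty()) {
            Node current = open.pop();
            visited.add(current);
            // Push element children last-to-first so the first child is popped next.
            for (Node child = current.getLastChild(); child != null;
                    child = child.getPreviousSibling()) {
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    open.push(child);
                }
            }
        }
        return visited;
    }

    // Maps nodes to their names, for printing or comparison.
    static List<String> names(List<Node> nodes) {
        List<String> out = new ArrayList<>();
        for (Node n : nodes) {
            out.add(n.getNodeName());
        }
        return out;
    }

    public static void main(String[] args) {
        Document doc = parse("<a><b><c/></b><d/></a>");
        System.out.println(names(traverse(doc.getDocumentElement())));
    }
}
```

Because the result is collected into a separate list before anything is returned, the caller can move or delete the visited nodes afterwards without disturbing the traversal.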