
Nokogiri: recursively get all children

Time: 2012-04-09 16:18:27

Tags: ruby search xhtml nokogiri

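Question

I am running some statistics against a variety of URLs. I want to find the top-level element that holds the largest concentration of children. The approach I would like to follow is to identify all top-level elements, and then determine what percentage of all the elements on the page belongs to each one.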
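The percentage step described above could be sketched roughly as follows. This is only an illustration: `TreeNode`, `descendant_count`, and `descendant_shares` are made-up names, and the `Struct` is a stand-in that models just `name` and `children`, so the snippet runs without Nokogiri. With a real document you would pass Nokogiri nodes instead, and would probably want `element_children` so that text nodes are not counted.

```ruby
# Hypothetical stand-in for a parsed node: only `name` and `children` exist.
TreeNode = Struct.new(:name, :children)

# Count every descendant of a node (children, grandchildren, and so on).
def descendant_count(node)
  node.children.inject(0) { |sum, child| sum + 1 + descendant_count(child) }
end

# For each top-level child of root, return [name, percentage], where the
# percentage is the share of root's descendants located in that child's
# subtree (the child itself included).
def descendant_shares(root)
  total = descendant_count(root)
  return [] if total.zero?

  root.children.map do |child|
    [child.name, (100.0 * (1 + descendant_count(child)) / total).round(1)]
  end
end

leaf = ->(name) { TreeNode.new(name, []) }
body = TreeNode.new('body', [
  TreeNode.new('header', [leaf.call('h1')]),
  TreeNode.new('main', [leaf.call('p'), leaf.call('p'), leaf.call('ul')]),
  TreeNode.new('footer', [])
])

top_name, top_share = descendant_shares(body).max_by { |_name, share| share }
puts "#{top_name}: #{top_share}%" # prints "main: 57.1%"
```

Pairs are returned rather than a hash because a real page usually has several top-level elements with the same name (for example multiple `div`s).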

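Goal

Recursively get all of the children of a given element.

Input: a Nokogiri element

Output: an array of Nokogiri elements, or the total number of children

Setup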

Ruby 1.9.2
Nokogiri gem

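What I ended up with (this works, but it is not as pretty as the answer I accepted below)

# Recursively count every descendant node of elem. Nokogiri's `children`
# returns all node types, so text nodes are counted as well.
def get_child_count(elem)
  children = elem.children
  return 0 if children.empty?

  child_count = children.count
  children.each do |child|
    child_count += get_child_count(child)
  end
  child_count
end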
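If recursion depth is ever a concern on very deeply nested markup, the same count can be done with an explicit stack. A minimal sketch, assuming only that each node responds to `children`; `StubNode` and `count_descendants_iteratively` are hypothetical names used here so the snippet runs on its own, and they are not part of Nokogiri.

```ruby
# Hypothetical stand-in for a Nokogiri node; only `children` is modeled.
StubNode = Struct.new(:children)

# Count all descendants without recursion: nodes still to be visited are
# kept on an explicit stack instead of the call stack.
def count_descendants_iteratively(elem)
  count = 0
  stack = elem.children.to_a.dup # dup so the node's own list is not mutated
  until stack.empty?
    node = stack.pop
    count += 1
    stack.concat(node.children.to_a)
  end
  count
end

tree = StubNode.new([
  StubNode.new([StubNode.new([]), StubNode.new([])]),
  StubNode.new([])
])
puts count_descendants_iteratively(tree) # prints 4
```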

2 answers:

Answer 0: (score: 30)

The traverse method recursively yields the current node and all of its children to a block.

# if you would like it to be returned as an array, rather than each node being yielded to a block, you can do this
result = []
doc.traverse {|node| result << node }
result

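# or, without an explicit accumulator
# (enum_for is built in on Ruby 1.9+, so no require is needed;
# map without a block would return an Enumerator, hence to_a)
result = doc.enum_for(:traverse).to_a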
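To see what `enum_for(:traverse)` is doing, here is a small self-contained sketch. `SketchNode` is a hypothetical stand-in, not a Nokogiri class; its `traverse` is written to mimic the behavior described above (descendants are yielded first, then the node itself). Note that on a real Nokogiri document `traverse` also visits text nodes, and it always includes the node you start from.

```ruby
# Hypothetical stand-in for a node with a traverse-like method.
class SketchNode
  attr_reader :name, :children

  def initialize(name, children = [])
    @name = name
    @children = children
  end

  # Yield every descendant first, then the node itself.
  def traverse(&block)
    children.each { |child| child.traverse(&block) }
    block.call(self)
  end
end

root = SketchNode.new('div', [
  SketchNode.new('p', [SketchNode.new('em')]),
  SketchNode.new('span')
])

walker = root.enum_for(:traverse) # an Enumerator; nothing is visited yet
names = walker.map(&:name)        # => ["em", "p", "span", "div"]
descendants = walker.count - 1    # subtract 1 because root is yielded too

puts names.inspect
puts descendants                  # prints 3
puts walker.map.class             # prints Enumerator (map without a block)
```

Because the Enumerator is lazy about when it walks the tree, `count`, `map`, `select`, and the other Enumerable methods can all be used on it without first building an array.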

Answer 1: (score: 8)

# Non-recursive
class Nokogiri::XML::Node
  def descendant_elements
    xpath('.//*')
  end
end

# Recursive 1
class Nokogiri::XML::Node
  def descendant_elements
    element_children.map{ |kid|
      [kid, kid.descendant_elements]
    }.flatten
  end
end

# Recursive 2
class Nokogiri::XML::Node
  def descendant_elements
    kids = element_children.to_a
    kids.concat(kids.map(&:descendant_elements)).flatten
  end
end