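I'm using Rails 4.2.7 with Ruby 2.3 and Nokogiri. How do I find only the direct tr children of a table, as opposed to nested ones? Currently I find the table rows like this...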
tables = doc.css('table')
tables.each do |table|
  rows = table.css('tr')
This finds the table's direct rows, for example
<table>
  <tbody>
    <tr>…</tr>
but it also finds rows that sit inside a nested table within a row, for example
<table>
  <tbody>
    <tr>
      <td>
        <table>
          <tr>This is found</tr>
        </table>
      </td>
    </tr>
How do I refine the search so that it only finds the direct tr elements?
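Answer 0: (score: 0)
I don't know whether this can be done directly with CSS/XPath, so I wrote a small method that searches for the nodes recursively. It stops recursing down a branch as soon as it finds a match.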
xml = %q{
<root>
  <table>
    <tbody>
      <tr nested="false">
        <td>
          <table>
            <tr nested="true">
              This is found</tr>
          </table>
        </td>
      </tr>
    </tbody>
  </table>
  <another_table>
    <tr nested = "false">
      <tr nested = "true">
      </tr>
    </tr>
  </another_table>
  <tr nested = "false"/>
</root>
}
require 'nokogiri'
doc = Nokogiri::XML.parse(xml)
class Nokogiri::XML::Node
  def first_children_found(desired_node)
    if name == desired_node
      [self]
    else
      element_children.map{|child|
        child.first_children_found(desired_node)
      }.flatten
    end
  end
end
doc.first_children_found('tr').each do |tr|
  puts tr["nested"]
end
#=>
# false
# false
# false
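Answer 1: (score: 0)
You can do this in a couple of steps using XPath. First you need to find the "level" of the table (i.e. how deeply it is nested within other tables), and then find all descendant tr elements that have that same number of table ancestors: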
tables = doc.xpath('//table')
tables.each do |table|
  level = table.xpath('count(ancestor-or-self::table)')
  rows = table.xpath(".//tr[count(ancestor::table) = #{level}]")
  # do what you want with rows...
end
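In the more general case, where tr elements might be nested directly inside other tr elements, you could do something like the following (this would be invalid HTML, but you might have XML or some other markup):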
tables.each do |table|
  # Find the first descendant tr, and determine its level. This
  # will be a "top-level" tr for this table. "level" here means how
  # many tr elements (including itself) are between it and the
  # document root.
  level = table.xpath("count(descendant::tr[1]/ancestor-or-self::tr)")
  # Now find all descendant trs that have that same level. Since
  # the table itself is at a fixed level, this means all these nodes
  # will be "top-level" rows for this table.
  rows = table.xpath(".//tr[count(ancestor-or-self::tr) = #{level}]")
  # handle rows...
end
The first step can be split into two separate queries, which may be clearer:
first_tr = table.at_xpath(".//tr")
level = first_tr.xpath("count(ancestor-or-self::tr)")
(This will fail if there is a table with no tr, because first_tr will be nil. The combined XPath above handles that case correctly.)