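I am converting an XML table into an HTML table, and I have to rearrange the nodes along the way.
To do the conversion, I scrape the XML, put it into a two-dimensional array, and then build the new HTML for output.
However, some of the cells contain HTML tags, and after the conversion <su> becomes &lt;su&gt;.
The XML data is: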
<BOXHD>
<CHED H="1">Disc diameter, inches (cm)</CHED>
<CHED H="1">One-half or more of disc covered</CHED>
<CHED H="2">Number <SU>1</SU>
</CHED>
<CHED H="2">Exhaust foot <SU>3</SU>/min.</CHED>
<CHED H="1">Disc not covered</CHED>
<CHED H="2">Number <SU>1</SU>
</CHED>
<CHED H="2">Exhaust foot<SU>3</SU>/min.</CHED>
</BOXHD>
The steps I take to convert it into an HTML table are:
class TableCell
  attr_accessor :text, :rowspan, :colspan
  def initialize(text='')
    @text = text
    @rowspan = 1
    @colspan = 1
  end
end
@frag = Nokogiri::HTML(xml)
# make a 2d array to store how the cells should be arranged
column = 0
prev_row = -1
@frag.xpath("boxhd/ched").each do |ched|
  row = ched.xpath("@h").first.value.to_i - 1
  if row <= prev_row
    column += 1
  end
  prev_row = row
  @data[row][column] = TableCell.new(ched.inner_html)
end
# methods to find colspan and rowspan, put them in @data
# ... snip ...
# now build an html table
doc = Nokogiri::HTML::DocumentFragment.parse ""
Nokogiri::HTML::Builder.with(doc) do |html|
  html.table {
    @data.each do |tr|
      html.tr {
        tr.each do |th|
          next if th.nil?
          html.th(:rowspan => th.rowspan, :colspan => th.colspan).table_header th.text
        end
      }
    end
  }
end
This produces the following HTML (note that the superscripts are escaped):
<table>
  <tr>
    <th rowspan="2" colspan="1" class="table_header">Disc diameter, inches (cm)</th>
    <th rowspan="1" colspan="2" class="table_header">One-half or more of disc covered</th>
    <th rowspan="1" colspan="2" class="table_header">Disc not covered</th>
  </tr>
  <tr>
    <th rowspan="1" colspan="1" class="table_header">Number &lt;su&gt;1&lt;/su&gt; </th>
    <th rowspan="1" colspan="1" class="table_header">Exhaust foot &lt;su&gt;3&lt;/su&gt;/min.</th>
    <th rowspan="1" colspan="1" class="table_header">Number &lt;su&gt;1&lt;/su&gt;</th>
    <th rowspan="1" colspan="1" class="table_header">Exhaust foot&lt;su&gt;3&lt;/su&gt;/min.</th>
  </tr>
</table>
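For reference, the placement loop and the snipped span step can be sketched in plain Ruby without Nokogiri. This is only an illustration: Cell, place_headers and assign_spans are hypothetical names, and the span rule (a cell absorbs the empty slots to its right and directly below it) is an assumption that happens to reproduce the rowspan and colspan values shown in the output for this header layout. The original span methods are not shown, so they may well work differently.

```ruby
# Cell is a hypothetical stand-in for the question's TableCell class.
Cell = Struct.new(:text, :rowspan, :colspan)

# Place [level, text] pairs in a grid using the same rule as the question's loop:
# a header that is not deeper than the previous one starts a new column.
def place_headers(headers)
  grid = []
  column = 0
  prev_row = -1
  headers.each do |level, text|
    row = level - 1
    column += 1 if row <= prev_row
    prev_row = row
    grid[row] ||= []
    grid[row][column] = Cell.new(text, 1, 1)
  end
  # Pad every row to the same width so empty slots are explicit nils.
  width = grid.map { |cells| cells.to_a.length }.max.to_i
  grid.map { |cells| cells.to_a + [nil] * (width - cells.to_a.length) }
end

# Assumed span rule (the original methods are not shown): a cell absorbs the
# empty slots to its right (colspan) and directly below it (rowspan).
def assign_spans(grid)
  grid.each_with_index do |cells, r|
    cells.each_with_index do |cell, c|
      next if cell.nil?
      cell.colspan = 1 + cells.drop(c + 1).take_while(&:nil?).length
      cell.rowspan = 1 + grid.drop(r + 1).take_while { |below| below[c].nil? }.length
    end
  end
end

headers = [
  [1, 'Disc diameter, inches (cm)'],
  [1, 'One-half or more of disc covered'],
  [2, 'Number'], [2, 'Exhaust'],
  [1, 'Disc not covered'],
  [2, 'Number'], [2, 'Exhaust']
]
assign_spans(place_headers(headers)).each do |cells|
  puts cells.compact.map { |cell| "#{cell.text} (rowspan=#{cell.rowspan}, colspan=#{cell.colspan})" }.join(' | ')
end
```

Keeping the layout step separate from the HTML generation makes it easier to check the spans on their own before worrying about how the cell text is serialized.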
How do I get the raw HTML instead of the entities?
I have tried these, without success:
@data[row][column] = TableCell.new(ched.children)
@data[row][column] = TableCell.new(ched.children.to_s)
@data[row][column] = TableCell.new(ched.to_s)
Answer 0 (score: 1)
Answer 1 (score: 0)
I gave up on the builder and just built the HTML by hand:
headers = html_headers()
def html_headers()
  rows = Array.new
  @data.each do |row|
    cells = Array.new
    row.each do |cell|
      next if cell.nil?
      cells << "<th rowspan=\"%d\" colspan=\"%d\">%s</th>" %
        [cell.rowspan,
         cell.colspan,
         cell.text]
    end
    rows << "<tr>%s</tr>" % cells.join
  end
  rows.join
end
def replace_nodes(headers)
  # ... snip ...
  @frag.xpath("boxhd").each do |old|
    puts "replacing boxhd..."
    old.replace headers
  end
  # ... snip ...
end
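I don't understand why, but it seems that the text I replaced the <BOXHD> tags with gets parsed and becomes searchable, because I was able to change tag names in the data that came from cell.text.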