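Here is a link to the XLS file (the same URL appears in the code below). I am trying to use the Spreadsheet gem to extract its contents. In particular, I want to collect all the column headers (Year, Gross National Product, etc.). The problem is that they are not all in the same row. For example, the Gross National Income header is spread over three rows. I also want to know how many rows are merged to form the 'Year' cell.
I have started writing the program, and this is how far I have got: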
require 'rubygems'
require 'open-uri'
require 'spreadsheet'

rows = Array.new
url = 'http://www.stats.gov.cn/tjsj/ndsj/2012/html/C0201e.xls'
# URI.open is required on Ruby 3.0+, where Kernel#open no longer accepts URLs.
doc = Spreadsheet.open(URI.open(url))
sheet1 = doc.worksheet 0
sheet1.each do |row|
  if row.is_a? Spreadsheet::Formula
    # puts row.value
    rows << row.value
  else
    # puts row
    rows << row
  end
  # puts row.value
end
But now I am stuck and really need some guidance on how to proceed. Any kind of help is appreciated.
Answer 0 (score: 3)
require 'open-uri'
require 'spreadsheet'

url = 'http://www.stats.gov.cn/tjsj/ndsj/2012/html/C0201e.xls'
# URI.open is required on Ruby 3.0+, where Kernel#open no longer accepts URLs.
doc = Spreadsheet.open(URI.open(url))
sheet1 = doc.worksheet 0

# Copy every row into a plain array so it can be indexed freely.
rows = []
sheet1.each do |row|
  rows << row.to_a
end

# The header block starts at the row whose first cell is "Year".
index = rows.index { |row| row[0] == 'Year' } || 0

# Collect header rows until the first data row, i.e. the first row whose
# first cell contains a digit. The upper bound of 7 is specific to this sheet.
temp_rows = []
(index..7).each do |i|
  # to_s guards against numeric cells, which cannot be matched with =~.
  break if rows[i].nil? || rows[i][0].to_s =~ /[0-9]/
  temp_rows << rows[i]
end

# Join the pieces of each column's header from top to bottom.
col_size = temp_rows.map(&:size).max.to_i
column_headers = []
col_size.times do |c|
  parts = temp_rows.map { |row| row[c].to_s.strip }.reject(&:empty?)
  column_headers << parts.join(' ')
end

puts 'Column headers of this XLS file are:'
column_headers.each do |col|
  puts col.inspect unless col.empty?
end
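
The header-joining step can also be pulled out into a small function that runs without the gem, which makes it easier to test. The sketch below does that and adds a helper for the second part of the question: how many rows the 'Year' cell spans. Both `combine_header_rows` and `merged_row_span` are hypothetical names, and the sample data is made up. The span helper assumes each merged range is given as `[first_row, last_row, first_col, last_col]`, zero-based and inclusive. As far as I know the Spreadsheet gem exposes such ranges through `Worksheet#merged_cells`, but treat that as an assumption and check it against your gem version.

```ruby
# Hypothetical helper: join the stacked header rows into one label per column.
def combine_header_rows(header_rows)
  width = header_rows.map(&:size).max.to_i
  (0...width).map do |col|
    header_rows
      .map { |row| row[col].to_s.strip } # nil and numeric cells become strings
      .reject(&:empty?)
      .join(' ')
  end
end

# Hypothetical helper: number of rows covered by the merged range that
# contains the cell at (row, col). ASSUMPTION: each range has the form
# [first_row, last_row, first_col, last_col], zero-based and inclusive.
# Returns 1 when the cell is not part of any merged range.
def merged_row_span(merged_ranges, row, col)
  range = merged_ranges.find do |r0, r1, c0, c1|
    (r0..r1).cover?(row) && (c0..c1).cover?(col)
  end
  range ? range[1] - range[0] + 1 : 1
end

# Made-up sample data shaped like a three-row header block.
header_rows = [
  ['Year', 'Gross',    'Gross'],
  [nil,    'National', 'Domestic'],
  [nil,    'Income',   'Product']
]
merged_ranges = [[0, 2, 0, 0]] # "Year" spans rows 0..2 in column 0

p combine_header_rows(header_rows)
# => ["Year", "Gross National Income", "Gross Domestic Product"]
p merged_row_span(merged_ranges, 0, 0)
# => 3
```

With the real sheet, `header_rows` would be the `temp_rows` array built above, and the row and column passed to `merged_row_span` would be the position of the 'Year' cell.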