Saving web-scraped data

Time: 2011-07-20 06:07:52

Tags: ruby rubygems

I am trying to scrape a website. I can fetch the data from the site, but I cannot save the scraped data to the YAML file named in my code.
My code:

require 'rubygems'
require 'open-uri'
require 'hpricot'

article = []
     doc = open("http://www.cmegroup.com/trading/interest-rates/cleared-otc/irs.html"{|f| Hpricot(f) }

      (doc/"/html/body/div/div/div/div/table/").each do |article|
      puts "#{article.inner_html}"
       end

File.open('test.yaml', 'w') { |f|
f <<article.to_yaml
}

1 answer:

Answer 0 (score: 0)

First, you are missing the closing parenthesis of the open call (the one that belongs right before the { ... } block).

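Once you add it, you will get a NoMethodError (undefined method 'to_yaml' for []:Array). To fix that, you need to require 'yaml', which monkey-patches the Array class with that method. After that you will notice that your YAML file contains no data (only an empty array), because you never add anything to article: the loop only prints each element, and its block parameter happens to have the same name as the array. Here is a fixed version: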

require 'rubygems'
require 'open-uri'
require 'hpricot'
require 'yaml'

articles = []
url = "http://www.cmegroup.com/trading/interest-rates/cleared-otc/irs.html"
doc = open(url) {|f| Hpricot(f) }

(doc/"/html/body/div/div/div/div/table/").each do |article|
  articles << article.inner_html
end

File.open('test.yaml', 'w') { |f| f << articles.to_yaml }
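
The two remaining points (nothing was ever collected, and to_yaml needs require 'yaml') can be checked without touching the network. The following is only a minimal sketch under stated assumptions: the fragments array is a hypothetical stand-in for the inner_html strings Hpricot would return, and the file is written to the system temp directory rather than to test.yaml.

```ruby
require 'yaml'    # provides to_yaml and YAML.load_file
require 'tmpdir'  # provides Dir.tmpdir

# Hypothetical stand-in for the table fragments returned by the scraper.
fragments = ['<tr><td>1Y</td></tr>', '<tr><td>2Y</td></tr>']

# The original pattern: the block parameter has the same name as the outer
# array, and nothing is ever appended, so the outer array stays empty.
article = []
fragments.each { |article| article.length }

# The fixed pattern: collect each fragment into a separately named array.
articles = []
fragments.each { |fragment| articles << fragment }

# Write the collected fragments as YAML, then read them back to verify.
path = File.join(Dir.tmpdir, 'scrape_sketch.yaml')
File.open(path, 'w') { |f| f << articles.to_yaml }
loaded = YAML.load_file(path)
```

After this runs, article is still empty, while loaded holds the same two strings as articles. Using a distinct name for the block parameter is what makes the mistake easy to spot.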