I am trying to scrape a website. I can fetch the data from the site, but I cannot save the scraped data to the YAML file shown in my code.
My code:
require 'rubygems'
require 'open-uri'
require 'hpricot'
article = []
doc = open("http://www.cmegroup.com/trading/interest-rates/cleared-otc/irs.html"{|f| Hpricot(f) }
(doc/"/html/body/div/div/div/div/table/").each do |article|
puts "#{article.inner_html}"
end
File.open('test.yaml', 'w') { |f|
f <<article.to_yaml
}
Answer 0 (score: 0)
First, you are missing the closing parenthesis of the open call (the one that belongs before the { ... } block).
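Once you add it, you will get a NoMethodError (undefined method 'to_yaml' for []:Array). To fix that, you need require 'yaml', which monkey-patches to_yaml onto the Array class. After that you will notice that your YAML file contains no data (only an empty array), because nothing is ever appended to the article array; the loop only prints each element. Here is a fixed version: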
require 'rubygems'
require 'open-uri'
require 'hpricot'
require 'yaml'
articles = []
url = "http://www.cmegroup.com/trading/interest-rates/cleared-otc/irs.html"
doc = open(url) {|f| Hpricot(f) }
(doc/"/html/body/div/div/div/div/table/").each do |article|
articles << article.inner_html
end
File.open('test.yaml', 'w') { |f| f << articles.to_yaml }
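To check the YAML step on its own, without hitting the network, here is a minimal sketch. It assumes two made-up HTML strings stand in for the scraped table contents; the sample data and the names yaml_text and restored are illustrative and not part of the original script.

```ruby
require 'yaml'  # provides to_yaml on Array and the YAML module

# Hypothetical stand-ins for the inner_html of the scraped tables
articles = []
['<tr><td>row 1</td></tr>', '<tr><td>row 2</td></tr>'].each do |html|
  articles << html   # append each result instead of only printing it
end

# Serialize the array to a YAML string, then parse it back
yaml_text = articles.to_yaml
restored  = YAML.load(yaml_text)

puts yaml_text
```

If restored equals articles, the serialization works and any remaining problem lies in the scraping step rather than in writing the file.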