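I'm slowly getting to where I want to be. I'm fetching data by screen scraping and want to save it to my model, which has two columns, home_team and away_team. So far I've managed to grab the data: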

    FIXTURE_URL = "http://www.bbc.co.uk/sport/football/premier-league/fixtures"

    def get_fixtures # Get me all Home and away Teams
      doc = Nokogiri::HTML(open(FIXTURE_URL))
      home_team = doc.css(".team-home.teams").map {|h| h.text.strip }
      away_team = doc.css(".team-away.teams").map {|a| a.text.strip }
      #team_clean = Hash[:home_team => home_team, :away_team => away_team]
      #team_clean = Hash[:team_clean => [Hash[:home_team => home_team, :away_team => away_team]]]
    end

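I've commented out two ways of turning the data into a hash: one is a plain hash, the other is a hash nested inside a hash. I'm not sure which of them I need, if either.

If I only wanted to save the data from home_team, I would run a rake task like this: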

    def update_fixtures #rake task method
      Fixture.destroy_all
      get_fixtures.each {|home| Fixture.create(:home_team => home )}
    end

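What I want is to save home_team and away_team together in the same record. Do I need to read the data back out of the hash, and if so, how? I'm a bit lost, but this is my first attempt at this.

Any help appreciated.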

Answer 0 (score: 2)

Try this:

    require 'nokogiri'
    require 'open-uri'

    FIXTURE_URL = "http://www.bbc.co.uk/sport/football/premier-league/fixtures"

    def get_fixtures # Create one Fixture per match row
      # URI.open comes from open-uri; a bare open(url) no longer works on Ruby 3.0+
      doc = Nokogiri::HTML(URI.open(FIXTURE_URL))
      matches = doc.css('tr.preview')
      matches.each do |match|
        # Read both teams from the same row so they stay paired
        home_team = match.css('.team-home').text.strip
        away_team = match.css('.team-away').text.strip
        Fixture.create!(home_team: home_team, away_team: away_team)
      end
    end

This loops over the matches and creates a new Fixture for each one, with both the home team and the away team.
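
If you would rather keep the two arrays built in the question, one option is to pair them up with Array#zip. This is only a minimal sketch: the team names are made-up placeholders, and it assumes both arrays have the same length and order, which the per-row loop above does not need to assume.

```ruby
# Placeholder data standing in for the two scraped arrays (hypothetical names).
home_teams = ['Team A', 'Team C']
away_teams = ['Team B', 'Team D']

# zip pairs the elements by index: [['Team A', 'Team B'], ['Team C', 'Team D']]
fixtures = home_teams.zip(away_teams).map do |home, away|
  { home_team: home, away_team: away }
end

# In the rake task, each hash could then be handed to the model,
# e.g. fixtures.each { |attrs| Fixture.create!(attrs) }
fixtures.each { |attrs| puts "#{attrs[:home_team]} v #{attrs[:away_team]}" }
```

If one array ends up shorter than the other, zip fills the gaps with nil, so reading both teams from the same table row is the safer route.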

Edit:

Added .text.strip

Edit 2:

This also gives you the date:

    require 'nokogiri'
    require 'open-uri'
    require 'date'

    FIXTURE_URL = "http://www.bbc.co.uk/sport/football/premier-league/fixtures"

    def get_fixtures # Create one Fixture per match, including its date
      # URI.open comes from open-uri; a bare open(url) no longer works on Ruby 3.0+
      doc = Nokogiri::HTML(URI.open(FIXTURE_URL))
      doc.css('#fixtures-data h2').each do |h2_tag|
        date = Date.parse(h2_tag.text.strip)
        # The element directly after each h2 holds that day's matches
        matches = h2_tag.xpath('following-sibling::*[1]').css('tr.preview')
        matches.each do |match|
          home_team = match.css('.team-home').text.strip
          away_team = match.css('.team-away').text.strip
          Fixture.create!(home_team: home_team, away_team: away_team, date: date)
        end
      end
    end

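It's a bit more complicated than the previous code, because it has to use some XPath to get the next HTML element after the h2 tag that contains the date.

It iterates over all the h2 tags inside div#fixtures-data, then grabs the table that comes directly after each h2 tag.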