How do I parse dates from Nokogiri?

Asked: 2013-07-28 17:32:58

Tags: ruby nokogiri ruby-on-rails-4

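I'm trying to parse the schedule of my old high school football team.

I managed to get the nodes that contain each game's date, but when I try to convert one into a Ruby date object I get an invalid date error. However, when I copy the date printed by puts gamedate and paste it into the script as a test, it converts to a date object just fine.

What is the difference between passing the gamedate string to strptime and pasting the same value in as hard-coded input?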

require 'rubygems'
require 'nokogiri'
require 'open-uri'
url = "http://www.maxpreps.com/high-schools/fitzgerald-hurricanes-(fitzgerald,ga)/football/schedule.htm"
doc = Nokogiri::HTML(URI.open(url))
games = doc.css('.dual-contest')
games.each do |game|
  puts gamedate = game.css(".event-date").xpath('@title').to_s
  #works
  puts date = DateTime.strptime('2013-08-24T02:30:00','%Y-%m-%dT%H:%M:%S')
  #does not work
  puts date = DateTime.strptime(gamedate,'%Y-%m-%dT%H:%M:%S')
end

1 Answer:

Answer 0 (score: 1)

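Here is the reason. For the first .dual-contest node the @title lookup comes back empty, so to_s gives an empty string and strptime raises the invalid date error: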

require 'rubygems'
require 'nokogiri'
require 'open-uri'
url = "http://www.maxpreps.com/high-schools/fitzgerald-hurricanes-(fitzgerald,ga)/football/schedule.htm"
doc = Nokogiri::HTML(URI.open(url))
games = doc.css('.dual-contest')
games.each do |game|
  puts gamedate = game.css(".event-date").xpath('@title').empty?
end

# >> true
# >> false
# >> false
# >> false
# >> false
# >> false
# >> false
# >> false
# >> false
# >> false
# >> false
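
This can be reproduced without hitting the site: an empty node set stringifies to an empty string, and that is what strptime rejects. A minimal sketch, assuming a hypothetical helper parse_game_date (not part of the original code) that returns nil for a blank title instead of raising:

```ruby
require 'date'

# Hypothetical helper: returns nil when there is nothing to parse,
# otherwise parses the timestamp with the format used in the question.
def parse_game_date(raw, format = '%Y-%m-%dT%H:%M:%S')
  return nil if raw.nil? || raw.strip.empty?
  DateTime.strptime(raw, format)
end

# What the loop in the question effectively does for the first node:
begin
  DateTime.strptime('', '%Y-%m-%dT%H:%M:%S')
rescue ArgumentError => e
  puts "empty string rejected: #{e.class}"
end

p parse_game_date('')                     # nothing to parse
p parse_game_date('2013-08-24T02:30:00')  # parsed normally
```

Inside the original loop, something like next if gamedate.empty? before the strptime call would have the same effect.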

Looking at it from another angle: the first .dual-contest element has no .event-date inside it, so at_css returns nil:

require 'rubygems'
require 'nokogiri'
require 'open-uri'
url = "http://www.maxpreps.com/high-schools/fitzgerald-hurricanes-(fitzgerald,ga)/football/schedule.htm"
doc = Nokogiri::HTML(URI.open(url))
games = doc.at_css('.dual-contest').at_css(".event-date").at_xpath('@title')
puts games
# ~> -:6:in `<main>': undefined method `at_xpath' for nil:NilClass (NoMethodError)
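
That NoMethodError is just a method call on nil. Here is a sketch of the same situation without the network, using plain hashes as hypothetical stand-ins for the rows on the page (the data below is made up to mirror the shape of the problem). With real Nokogiri nodes the analogous guard would be the safe-navigation operator, for example game.at_css('.event-date')&.[]('title') on Ruby 2.3 or later.

```ruby
# Hypothetical stand-in data: the first row has no date cell at all,
# like the first .dual-contest element on the page.
rows = [
  {},
  { date_cell: { title: '2013-08-24T02:30:00' } },
  { date_cell: { title: '2013-09-07T02:30:00' } }
]

# Hash#dig returns nil when an intermediate key is missing instead of
# raising NoMethodError; compact then drops the rows without a date.
titles = rows.map { |row| row.dig(:date_cell, :title) }.compact

p titles
```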

I would do it like this:

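require 'nokogiri'
require 'open-uri'
require 'date'

url = "http://www.maxpreps.com/high-schools/fitzgerald-hurricanes-(fitzgerald,ga)/football/schedule.htm"
doc = Nokogiri::HTML(URI.open(url))
# Select the .event-date elements under #schedule directly,
# so rows that have no date element never enter the loop
doc.css('#schedule .event-date').each do |nd|
  dt = nd['title']  # the title attribute holds the timestamp string
  p dt, DateTime.parse(dt)
end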
# >> "2013-08-24T02:30:00"
# >> #<DateTime: 2013-08-24T02:30:00+00:00 ((2456529j,9000s,0n),+0s,2299161j)>
# >> "2013-09-07T02:30:00"
# >> #<DateTime: 2013-09-07T02:30:00+00:00 ((2456543j,9000s,0n),+0s,2299161j)>
# >> "2013-09-14T02:30:00"
# >> #<DateTime: 2013-09-14T02:30:00+00:00 ((2456550j,9000s,0n),+0s,2299161j)>
# >> "2013-09-21T02:30:00"
# >> #<DateTime: 2013-09-21T02:30:00+00:00 ((2456557j,9000s,0n),+0s,2299161j)>
# >> "2013-09-28T02:30:00"
# >> #<DateTime: 2013-09-28T02:30:00+00:00 ((2456564j,9000s,0n),+0s,2299161j)>
# >> "2013-10-05T02:30:00"
# >> #<DateTime: 2013-10-05T02:30:00+00:00 ((2456571j,9000s,0n),+0s,2299161j)>
# >> "2013-10-12T02:30:00"
# >> #<DateTime: 2013-10-12T02:30:00+00:00 ((2456578j,9000s,0n),+0s,2299161j)>
# >> "2013-10-19T02:30:00"
# >> #<DateTime: 2013-10-19T02:30:00+00:00 ((2456585j,9000s,0n),+0s,2299161j)>
# >> "2013-10-26T02:30:00"
# >> #<DateTime: 2013-10-26T02:30:00+00:00 ((2456592j,9000s,0n),+0s,2299161j)>
# >> "2013-11-02T02:30:00"
# >> #<DateTime: 2013-11-02T02:30:00+00:00 ((2456599j,9000s,0n),+0s,2299161j)>
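
For what it's worth, DateTime.parse guesses the format, while strptime and DateTime.iso8601 are stricter. For titles shaped like the ones above, all three should agree. A small sketch using one of the values from the output (the variable names are only for illustration):

```ruby
require 'date'

raw = '2013-08-24T02:30:00'

loose  = DateTime.parse(raw)                          # format is guessed
strict = DateTime.strptime(raw, '%Y-%m-%dT%H:%M:%S')  # explicit format, as in the question
iso    = DateTime.iso8601(raw)                        # accepts ISO 8601 input only

# All three produce the same instant for this input.
puts loose == strict && strict == iso

# The string carries no zone, so the offset defaults to +00:00,
# which matches the output shown above.
puts loose.offset.zero?
```

If the page ever changes the format of the title attribute, the stricter variants will raise rather than silently guess, which may or may not be what you want.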