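Rails 5 model with Nokogiri gem returns nil

Time: 2017-03-15 14:19:37

Tags: ruby web-scraping nokogiri ruby-on-rails-5 activemodel

I have a problem with my Rails 5 application. I created a class, scrape.rb, that scrapes HTML with the Nokogiri gem and saves the data in another model. But when I create a new object in the Rails console, the method returns nil and no values are scraped: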

2.3.0 :018 > s = Scrape.new
 => #<Scrape:0x007fba68b79e98>
2.3.0 :019 > s.scrape_new_movie
 => nil
2.3.0 :020 >

Here is the scrape.rb model:

 class Scrape
  attr_accessor :title, :vote, :image_url, :description,

  def scrape_new_movie
    begin
      doc = Nokogiri::HTML(open("https://zalukaj.com/zalukaj-film/26280/barbie_w_wiecie_gier_barbie_video_game_hero_2017_.html").read, nil, 'utf-8')
      doc.css('script').remove
      self.title = doc.css('#pw_title.about_movie_title').text
      v = doc.css('#success_vote').text
      self.vote = v.slice(2...5)
      self.image_url = doc.css('.about_movie img').attr('src').text
      self.description = doc.css('#pw_description.e_s3k').text
      return true
      rescue Exception => e
      self.failure = "Something went wrong with the scrape"
    end
  end

  def save_movie
    movie = Movie.new(
      title: self.title,
      vote: self.vote,
      image_url: self.image_url,
      description: self.description
    )
    movie.save
  end
end

3 answers:

Answer 0: (score: 0)

Replace this method:

  def scrape_new_movie
    begin
      doc = Nokogiri::HTML(open("https://zalukaj.com/zalukaj-film/26280/barbie_w_wiecie_gier_barbie_video_game_hero_2017_.html").read, nil, 'utf-8')
      doc.css('script').remove
      self.title = doc.css('#pw_title.about_movie_title').text
      v = doc.css('#success_vote').text
      self.vote = v.slice(2...5)
      self.image_url = doc.css('.about_movie img').attr('src').text
      self.description = doc.css('#pw_description.e_s3k').text
      return true
      rescue Exception => e
      self.failure = "Something went wrong with the scrape"
    end
  end

with:

  def scrape_new_movie
    doc = Nokogiri::HTML(open("https://zalukaj.com/zalukaj-film/26280/barbie_w_wiecie_gier_barbie_video_game_hero_2017_.html").read, nil, 'utf-8')
    doc.css('script').remove
    self.title = doc.css('#pw_title.about_movie_title').text
    v = doc.css('#success_vote').text
    self.vote = v.slice(2...5)
    self.image_url = doc.css('.about_movie img').attr('src').text
    self.description = doc.css('#pw_description.e_s3k').text
    return true
  end

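Any error that occurs will then bubble up and be displayed in a way that lets you debug the problem.

This is a good example of why you should never `rescue Exception`: it always makes problems harder to debug. See: Why is it a bad style to `rescue Exception => e` in Ruby?

Answer 1: (score: 0)

Depending on how you set this up, you don't need to instantiate the class at all. Just prefix the methods with `self.` and don't call `new`. There are also quite a few errors in this script; if you want to debug it, I would re-raise the exception so that its message is visible. You should also change `self.title =` to `@title =`, or, if you want to keep `self.title`, open the singleton class with `class << self` and put the attr_accessor inside it.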
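To make the point about `rescue Exception` concrete, here is a minimal sketch that needs no network access. The class names `SilentScrape` and `LoudScrape` and the undefined `fetch_title` call are made up for illustration; they stand in for any bug inside the scraping method. Rescuing only `IOError` and `SystemCallError` is just one possible choice, not a recommendation specific to Nokogiri.

```ruby
# Hypothetical stand-ins for the scraper; no network access is involved.
class SilentScrape
  attr_accessor :title, :failure

  # Mirrors the pattern from the question: every error is swallowed.
  def scrape
    self.title = fetch_title # bug: fetch_title is not defined anywhere
    true
  rescue Exception
    self.failure = "Something went wrong with the scrape"
  end
end

class LoudScrape
  attr_accessor :title, :failure

  # Rescues only I/O-style errors; programming bugs still surface.
  def scrape
    self.title = fetch_title # same bug as above
    true
  rescue IOError, SystemCallError => e
    self.failure = "Scrape failed: #{e.message}"
    false
  end
end

silent = SilentScrape.new
silent.scrape # the real cause is hidden behind a generic message

loud_error =
  begin
    LoudScrape.new.scrape
    nil
  rescue NameError => e
    e # the real cause, with its message and backtrace
  end

puts silent.failure
puts loud_error.message
```

In the first class the only clue is the generic failure string; in the second, the error names the missing method directly.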

require 'nokogiri'
require 'open-uri' # lets open() fetch a URL

class Scrape
  class << self
    # Class-level accessors, so self.title = ... works inside class methods
    attr_accessor :title, :vote, :image_url, :description, :failure
  end

  def self.scrape_new_movie
    begin
      doc = Nokogiri::HTML(open("https://zalukaj.com/zalukaj-film/26280/barbie_w_wiecie_gier_barbie_video_game_hero_2017_.html").read, nil, 'utf-8')
      doc.css('script').remove
      self.title = doc.css('#pw_title.about_movie_title').text
      v = doc.css('#success_vote').text
      self.vote = v.slice(2...5)
      self.image_url = doc.css('.about_movie img').attr('src').text
      self.description = doc.css('#pw_description.e_s3k').text
      return true
    rescue Exception => e
      raise e # re-raise so the real error is visible while debugging
    end
  end

  def self.save_movie
    movie = Movie.new(
      title: self.title,
      vote: self.vote,
      image_url: self.image_url,
      description: self.description
    )
    movie.save
  end
end

Scrape.scrape_new_movie

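Answer 2: (score: 0)

The method returns nil because you ended the attr_accessor line with a comma. Since `def` returns the method name as a symbol, the following `def scrape_new_movie` becomes one more argument to attr_accessor, which then defines a plain getter named scrape_new_movie that overwrites your method and returns an unset instance variable. The variable failure is also undefined, so I assume you need an attr_accessor for it as well.

You should change

  attr_accessor :title, :vote, :image_url, :description,

to

  attr_accessor :title, :vote, :image_url, :description, :failure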
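The effect of the trailing comma can be reproduced in isolation. This is a minimal sketch with made-up class names (`WithComma`, `WithoutComma`) and a trivial method body in place of the real scraping code:

```ruby
class WithComma
  # The trailing comma makes the following `def` the last argument,
  # so this is really: attr_accessor :title, :scrape_new_movie
  attr_accessor :title,

  def scrape_new_movie
    true
  end
end

class WithoutComma
  attr_accessor :title

  def scrape_new_movie
    true
  end
end

with_comma = WithComma.new.scrape_new_movie       # getter for @scrape_new_movie
without_comma = WithoutComma.new.scrape_new_movie # the method as written

p with_comma
p without_comma
```

With the comma, the call never reaches the method body at all, which is why the console shows nil without any error. Running Ruby with warnings enabled (`ruby -w`) should report the method being redefined.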