How to save images from a URL to disk

Time: 2014-11-12 07:45:39

Tags: ruby-on-rails ruby nokogiri

I want to download the images from a URL, for example http://trinity.e-stile.ru/, and save them to a directory such as "C:\pickaxe\pictures". Using Nokogiri is essential.

I have read similar questions on this site, but I could not work out how the solutions work, and I do not understand the algorithm.

  1. I wrote code that parses the URL and collects the "img" tags from the page source into the links object:

    require 'nokogiri'
    require 'open-uri'
    
    PAGE_URL="http://trinity.e-stile.ru/"
    page=Nokogiri::HTML(open(PAGE_URL))   # parse the page into a Nokogiri document
    links=page.css("img") # node set containing every img element
    puts links.length # this URL has 24 images
    puts
    links.each{|i| puts i } # prints e.g. <img border="0" alt="" src="/images/kroliku.jpg">
    puts
    puts
    links.each{|link| puts link['src'] } # prints e.g. /images/kroliku.jpg
    

    What method should I use to save the images after grabbing the HTML code?

  2. How do I put the images into a directory on disk?

  3. I changed the code, but it produces an error:

    /home/action/.parts/packages/ruby2.1/2.1.1/lib/ruby/2.1.0/net/http.rb:879:in `initialize': getaddrinfo: Name or service not known (SocketError)
    

    This is the current code:

    require 'nokogiri'
    require 'open-uri'
    require 'net/http'
    
    LOCATION = 'pics'
    if !File.exist? LOCATION         # create the folder if it does not exist
        require 'fileutils'
        FileUtils.mkpath LOCATION
    end
    
    #PAGE_URL = "http://ruby.bastardsbook.com/files/hello-webpage.html"
    #PAGE_URL="http://trinity.e-stile.ru/"
    PAGE_URL="http://www.youtube.com/"
    page=Nokogiri::HTML(open(PAGE_URL))   
    links=page.css("img")
    
    links.each{|link| 
        Net::HTTP.start(PAGE_URL) do |http|
          localname = link.gsub /.*\//, '' # keep only the filename
          resp = http.get link['src']
          open("#{LOCATION}/#{localname}", "wb") do |file|
            file.write resp.body
          end
        end
     }
    

2 Answers:

Answer 0 (score: 1)

You are almost done. The only thing left is to store the files. Let's do it.

LOCATION = 'C:\pickaxe\pictures'
if !File.exist? LOCATION         # create the folder if it does not exist
    require 'fileutils'
    FileUtils.mkpath LOCATION
end

require 'net/http'
require 'uri'
.... # your code with nokogiri etc.
links.each{|link|
    # Net::HTTP.start expects a host name, not a full URL
    Net::HTTP.start(URI(PAGE_URL).host) do |http|
      localname = link['src'].gsub(/.*\//, '') # keep only the filename
      resp = http.get link['src']              # assumes src is a path on the same host
      File.open("#{LOCATION}/#{localname}", "wb") do |file|
        file.write resp.body
      end
    end
}

That's it.
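A note on the SocketError quoted in the question: Net::HTTP.start takes a host name as its first argument, so passing a full URL such as PAGE_URL most likely makes the name lookup (getaddrinfo) fail. The sketch below uses only the standard URI library and makes no network request. It shows how the host name, an absolute image URL, and a local file name could be derived. The src value is the sample from the question, and the variable names are illustrative.

```ruby
require 'uri'

# PAGE_URL and the sample src value are taken from the question.
PAGE_URL = "http://trinity.e-stile.ru/"
src      = "/images/kroliku.jpg"

host      = URI(PAGE_URL).host             # host name only, the form Net::HTTP.start expects
image_uri = URI.join(PAGE_URL, src)        # absolute URI built from the page URL and the src
localname = File.basename(image_uri.path)  # last path segment, used as the local file name

puts host
puts image_uri
puts localname
```

With these three values, either Net::HTTP (using host and the path) or open-uri (using the absolute URL) can fetch the image.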

Answer 1 (score: 0)

The correct version:

require 'nokogiri'
require 'open-uri'


LOCATION = 'pics'
if !File.exist? LOCATION         # create the folder if it does not exist
    require 'fileutils'
    FileUtils.mkpath LOCATION
end

#PAGE_URL="http://trinity.e-stile.ru/"
PAGE_URL="http://www.youtube.com/"

page=Nokogiri::HTML(open(PAGE_URL)) 
links=page.css("img")

links.each{|link|
  uri = URI.join(PAGE_URL, link['src'] ).to_s # make absolute uri
  localname=File.basename(link['src'])
  File.open("#{LOCATION}/#{localname}",'wb') { |f| f.write(open(uri).read) }
}
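A slightly more defensive variant can be sketched as follows. It is not part of the original answer: download_plan and save_images are hypothetical helper names, and the src list stands in for what links.map { |l| l['src'] } would return. The sketch skips img tags without a usable src, keeps only http/https URLs, and takes the file name from the URL path so that a query string does not end up in it. It also assumes Ruby 2.5 or newer, where open-uri offers URI.open; on the Ruby 2.1 used in the question, open(url) plays that role.

```ruby
require 'uri'
require 'open-uri'
require 'fileutils'

# Hypothetical helper: turn raw src attribute values into
# [absolute_url, local_path] pairs, skipping nil or empty values.
def download_plan(page_url, srcs, location)
  srcs.compact.reject(&:empty?)
      .map    { |src| URI.join(page_url, src) }   # resolve relative to the page URL
      .select { |uri| uri.is_a?(URI::HTTP) }      # keep only http/https URLs
      .map    { |uri| [uri.to_s, File.join(location, File.basename(uri.path))] }
end

# Hypothetical helper: fetch each URL and write it to disk in binary mode.
# Assumes Ruby 2.5+ for URI.open.
def save_images(plan)
  plan.each do |url, path|
    FileUtils.mkdir_p(File.dirname(path))
    File.binwrite(path, URI.open(url, &:read))
  end
end

# Sample src values; the first one is from the question, the rest are made up.
plan = download_plan("http://trinity.e-stile.ru/",
                     ["/images/kroliku.jpg", nil, "", "pics/a.png?v=2"],
                     "pics")
plan.each { |url, path| puts "#{url} -> #{path}" }
# save_images(plan)   # uncomment to actually download (needs network access)
```

Keeping the planning step separate from the download step makes it possible to inspect the URL-to-file mapping before anything is fetched.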