Question

I'm trying to download latest.zip from WordPress.org using Net::HTTP. This is what I have so far:

Net::HTTP.start("wordpress.org/") { |http|
  resp = http.get("latest.zip")
  open("a.zip", "wb") { |file| 
    file.write(resp.body)
  }
  puts "WordPress downloaded"
}

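But that only gives me a 4-kilobyte HTML page with a 404 error (readable if I change the file name to a.txt). I think this has something to do with the URL, which probably redirects somehow, but I have no clue what I'm doing. I'm new to Ruby.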

Answer 1

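My first question is: why use Net::HTTP, or code at all, to download something that could be done more easily with curl or wget, which are designed to make downloading files easy?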

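But since you want to download things using code, I'd recommend looking at Open-URI if you want to follow redirects. It is part of Ruby's standard library and is very useful for quick HTTP/FTP access to pages and files: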

require 'open-uri'

# Kernel#open no longer opens URLs on Ruby 3.0+; URI.open does, and it follows redirects.
File.open('latest.zip', 'wb') do |fo|
  fo.write URI.open('http://wordpress.org/latest.zip').read
end

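I just ran it, waited a few seconds for it to finish, ran unzip on the downloaded "latest.zip", and it expanded into a directory containing the archive's contents.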
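For larger archives it may be preferable not to build the whole body as a single Ruby string. The following is a minimal sketch of the same Open-URI approach that copies the response to disk with IO.copy_stream. The `download` helper and its parameter names are hypothetical, introduced only for illustration, and the sketch assumes Ruby 2.5 or newer, where URI.open is available.

```ruby
require 'open-uri'

# Hypothetical helper (not part of Open-URI): save +url+ to +path+ in binary mode.
# URI.open follows redirects and yields an IO-like object for the response body.
def download(url, path)
  URI.open(url) do |remote|
    File.open(path, 'wb') do |local|
      # Copy in chunks instead of calling remote.read on the whole body.
      IO.copy_stream(remote, local)
    end
  end
  path
end

# Example (needs network access):
#   download('http://wordpress.org/latest.zip', 'latest.zip')
```

The helper returns the destination path, so a caller can pass it straight to whatever unpacks the archive.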

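Besides Open-URI, there are HTTPClient and Typhoeus, among others, which make it easy to open an HTTP connection and send queries or receive data. They are very powerful and worth getting to know.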

Answer 2

Net::HTTP doesn't provide a nice way of following redirects; here is a piece of code I've been using for a while:

require 'net/http'
class RedirectFollower
  class TooManyRedirects < StandardError; end

  attr_accessor :url, :body, :redirect_limit, :response

  def initialize(url, limit=5)
    @url, @redirect_limit = url, limit
  end

  def resolve
    raise TooManyRedirects if redirect_limit < 0

    self.response = Net::HTTP.get_response(URI.parse(url))

    if response.kind_of?(Net::HTTPRedirection)      
      self.url = redirect_url
      self.redirect_limit -= 1

      resolve
    end

    self.body = response.body
    self
  end

  def redirect_url
    if response['location'].nil?
      response.body.match(/<a href=\"([^>]+)\">/i)[1]
    else
      response['location']
    end
  end
end



wordpress = RedirectFollower.new('http://wordpress.org/latest.zip').resolve
puts wordpress.url
File.open("latest.zip", "wb") do |file|  # binary mode keeps the zip bytes intact on Windows
  file.write wordpress.body
end
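As a shorter alternative, here is a sketch that stays with plain Net::HTTP. It also points at a likely cause of the 404 in the question: Net::HTTP.start expects a bare host name (no trailing slash), and the path passed to get should begin with "/". The helper names `next_url` and `fetch` and the limit of 5 redirects are assumptions made for illustration; they are not part of Net::HTTP.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: resolve a Location header against the URL that produced it,
# so that relative redirects such as "/files/x.zip" also work.
def next_url(current_url, location)
  URI.join(current_url, location).to_s
end

# Hypothetical helper: GET +url+, following at most +limit+ redirects.
def fetch(url, limit = 5)
  raise ArgumentError, 'too many redirects' if limit < 0

  uri = URI.parse(url)
  # Pass the host and the path separately; request_uri always starts with "/".
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.get(uri.request_uri)
  end

  case response
  when Net::HTTPSuccess     then response
  when Net::HTTPRedirection then fetch(next_url(url, response['location']), limit - 1)
  else response.value # raises an exception for any other non-2xx status
  end
end

# Example (needs network access):
#   File.binwrite('latest.zip', fetch('http://wordpress.org/latest.zip').body)
```

Resolving the Location header with URI.join, rather than using it verbatim, is a design choice here: some servers send relative redirect targets, which URI.parse alone would not turn into a fetchable URL.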

Downloading a zip file via Net::HTTP
