Ruby and HTTPS: A socket operation was attempted to an unreachable network

Date: 2014-09-01 22:13:53

Tags: ruby https

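I am trying to download all the lecture notes from a course. Since I am learning Ruby, I thought it would be good practice to download all the PDFs for future use. Unfortunately, I get an exception saying that Ruby cannot connect for some reason. Here is my code: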

require 'net/http'

module Coursera 

  class Downloader
    attr_accessor :page_url
    attr_accessor :destination_directory
    attr_accessor :cookie
    def initialize(page_url,dest,cookie)
      @page_url=page_url
      @destination_directory = dest
      @cookie=cookie
    end
    def download
      puts @page_url
      request = Net::HTTP::Get.new(@page_url)
      puts @cookie.encoding
      request['Cookie']=@cookie
      # the line below is where the exception is thrown
      res = Net::HTTP.start(@page_url.hostname, use_ssl=true,@page_url.port) {|http|
        http.request(request)  
      }
      html_page = res.body
      pattern = /http[^\"]+\.pdf/
      i=0
      while (match = pattern.match(html_page,i)) != nil do
        # 0 is the entire string.
        url_string = match[0]
        # make sure that 'i' is updated
        i = match.begin(0)+1
        # we want just the name of the file.
        j = url_string.rindex("/")
        filename = url_string[j+1..url_string.length]
        destination = @destination_directory+"\\"+filename
        # I want to download that resource to that file.
        uri = URI(url_string)
        res = Net::HTTP.get_response(uri)
        # write that body to the file
        f=File.new(destination,mode="w")
        f.print(res.body)
      end
    end
  end
end

page_url_string = 'https://class.coursera.org/datasci-002/lecture'
puts page_url_string.encoding
dest='C:\\Users\\michael\\training material\\data_science'
page_url=URI(page_url_string)
# I copied this from my browsers developer tools, I'm omitting it since 
# it's long and has my session key in it
cookie="..."
downloader = Coursera::Downloader.new(page_url,dest,cookie)
downloader.download

When I run it, the following is written to the console:

Fast Debugger (ruby-debug-ide 0.4.22, debase 0.0.9) listens on 127.0.0.1:65485
UTF-8
https://class.coursera.org/datasci-002/lecture
UTF-8
Uncaught exception: A socket operation was attempted to an unreachable network. - connect(2)
    C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `initialize'
    C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `open'
    C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `block in connect'
    C:/Ruby200-x64/lib/ruby/2.0.0/timeout.rb:52:in `timeout'
    C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:877:in `connect'
    C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:862:in `do_start'
    C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:851:in `start'
    C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:582:in `start'
    C:/Users/michael/Documents/Aptana Studio 3 Workspace/practice/CourseraDownloader.rb:20:in `download'
    C:/Users/michael/Documents/Aptana Studio 3 Workspace/practice/CourseraDownloader.rb:52:in `<top (required)>'
    C:/Ruby200-x64/bin/rdebug-ide:23:in `load'
    C:/Ruby200-x64/bin/rdebug-ide:23:in `<main>'
C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `initialize': A socket operation was attempted to an unreachable network. - connect(2) (Errno::ENETUNREACH)
    from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `open'
    from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `block in connect'
    from C:/Ruby200-x64/lib/ruby/2.0.0/timeout.rb:52:in `timeout'
    from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:877:in `connect'
    from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:862:in `do_start'
    from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:851:in `start'
    from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:582:in `start'
    from C:/Users/michael/Documents/Aptana Studio 3 Workspace/practice/CourseraDownloader.rb:20:in `download'
    from C:/Users/michael/Documents/Aptana Studio 3 Workspace/practice/CourseraDownloader.rb:52:in `<top (required)>'
    from C:/Ruby200-x64/lib/ruby/gems/2.0.0/gems/ruby-debug-ide-0.4.22/lib/ruby-debug-ide.rb:86:in `debug_load'
    from C:/Ruby200-x64/lib/ruby/gems/2.0.0/gems/ruby-debug-ide-0.4.22/lib/ruby-debug-ide.rb:86:in `debug_program'
    from C:/Ruby200-x64/lib/ruby/gems/2.0.0/gems/ruby-debug-ide-0.4.22/bin/rdebug-ide:110:in `<top (required)>'
    from C:/Ruby200-x64/bin/rdebug-ide:23:in `load'
    from C:/Ruby200-x64/bin/rdebug-ide:23:in `<main>'

I wrote all the HTTP code following the instructions here. As far as I can tell, I am following them.

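I am using Windows 7, ruby 2.0.0p481, and Aptana Studio 3. When I copy the URL into my browser, it goes straight to the page with no problem. When I look at the request headers for that URL in my browser, I do not see anything else that I think I am missing. I have also tried setting the Host and Referer request headers, but it made no difference.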

I am out of ideas. I have searched Stack Overflow for similar questions, but that did not help. Please let me know what I am missing.

1 Answer:

Answer 0: (score: 0)

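So, I had the same error message on a different project, and the problem was that my machine literally could not connect to the IP/port. Have you tried curl? If it works in your browser, the browser may be using a proxy or something else to actually get there. Testing the URL with curl solved my problem.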
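The same kind of connectivity check can be done from Ruby itself. The sketch below is only an illustration: `tcp_reachable?` and `fetch_page` are hypothetical helper names, not part of the question's code. It also touches on something that may be worth checking in the question: in `Net::HTTP.start`, the positional arguments after the address are the port and then the proxy settings, while options such as `use_ssl` go in a trailing hash. Written as `use_ssl=true`, Ruby assigns a local variable and passes `true` positionally, so the real port lands in the proxy-address slot. That might be contributing to the error, though it is a guess and nothing in this thread confirms it.

```ruby
require 'net/http'
require 'socket'
require 'uri'

# Hypothetical helper: returns true if a plain TCP connection to host:port
# succeeds within `timeout` seconds, and false on common connection failures
# (refused, unreachable network, name resolution error, timeout).
def tcp_reachable?(host, port, timeout = 5)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

# Hypothetical helper: fetches a page with an optional Cookie header.
# The port stays in the second positional slot and use_ssl is passed in
# the trailing options hash.
def fetch_page(uri, cookie = nil)
  request = Net::HTTP::Get.new(uri)
  request['Cookie'] = cookie if cookie
  Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
end
```

With these in place, `tcp_reachable?('class.coursera.org', 443)` gives a quick answer on whether the machine can open a socket to the host at all. If it returns false while the browser still loads the page, a proxy configured in the browser is a plausible explanation, as the answer suggests.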