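I'm writing a script that iterates over a list of URLs and does some processing on each one. One of the URLs in my list is giving me a problem. The code is as follows: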
url = "https://secure.www.alumniconnections.com/olc/pub/CDB/events/attendance.cgi?tmpl=attendance&event=2309515&sort=4"
uri = URI.parse(url)
response = Net::HTTP.get_response(uri)
The last line raises the following error:
EOFError: end of file reached
from /usr/lib/ruby/1.8/net/protocol.rb:135:in `sysread'
from /usr/lib/ruby/1.8/net/protocol.rb:135:in `rbuf_fill'
from /usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
from /usr/lib/ruby/1.8/timeout.rb:101:in `timeout'
from /usr/lib/ruby/1.8/net/protocol.rb:134:in `rbuf_fill'
from /usr/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
from /usr/lib/ruby/1.8/net/protocol.rb:126:in `readline'
from /usr/lib/ruby/1.8/net/http.rb:2028:in `read_status_line'
from /usr/lib/ruby/1.8/net/http.rb:2017:in `read_new'
from /usr/lib/ruby/1.8/net/http.rb:1051:in `request'
from /usr/lib/ruby/1.8/net/http.rb:948:in `request_get'
from /usr/lib/ruby/1.8/net/http.rb:380:in `get_response'
from /usr/lib/ruby/1.8/net/http.rb:543:in `start'
from /usr/lib/ruby/1.8/net/http.rb:379:in `get_response'
from (irb):5
from /usr/lib/ruby/1.8/uri/ftp.rb:190
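None of the other URLs in my list seem to give me any grief. Can anyone explain why I'm getting this error?
Answer 0 (score: 6)
I entered https://secure.www.alumniconnections.com/ and it seems to redirect me to http://www.harrisconnect.com/. My guess is that your code can't handle the redirect. Try using Mechanize (http://mechanize.rubyforge.org/) to handle this. Also, I suggest wrapping your code in some error handling, for example: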
# Prevent infinite loops
counter = 0
begin
  # Your code here
rescue EOFError
  puts "encountered EOFError"
  # Give up after 3 retries
  if counter < 3
    counter += 1
    puts "retry: #{counter}"
    # retry re-runs the begin block; redo only works inside a loop or block
    retry
  else
    puts "FAILED CONNECTION #{counter} TIMES"
    counter = 0
  end
end
This retries the connection, which has helped me in the past when connecting to a lot of URLs.
Edit:
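If you need the same retry logic in several places, the pattern above could be wrapped in a small method. This is only a sketch: with_retries and its max_retries parameter are names made up for illustration, not part of Net::HTTP or any library.

```ruby
# Hypothetical helper: run the given block, retrying on EOFError.
# After max_retries retries the last EOFError is re-raised to the caller.
def with_retries(max_retries = 3)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue EOFError
    # retry re-runs the begin block from the top
    retry if attempts <= max_retries
    raise
  end
end
```

You would then call it as response = with_retries { Net::HTTP.get_response(uri) }. Re-raising after the last attempt means the caller still finds out that the URL never succeeded, rather than silently moving on.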
require 'rubygems'
require 'mechanize'
agent = Mechanize.new
html_text = agent.get("https://secure.www.alumniconnections.com/olc/pub/CDB/events/attendance.cgi?tmpl=attendance&event=2309515&sort=4").body
html_file = File.open("html_file.html", "w")
html_file.write(html_text)
html_file.close
This writes the web page to a file. It worked fine for me, so give it a try.
Answer 1 (score: 0)
If it's HTTPS rather than plain HTTP, you can try this (works on Ruby 1.8.6):
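If you would rather not pull in Mechanize, the redirect guess can also be handled with Net::HTTP alone. The following is a rough sketch, not a drop-in fix: fetch_following_redirects and its limit parameter are hypothetical names, and on Ruby 1.8 you would additionally need require "net/https" for use_ssl to be available.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: GET a URL, following at most `limit` redirects.
def fetch_following_redirects(url, limit = 5)
  raise ArgumentError, "too many redirects" if limit <= 0

  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  # Enable SSL only when the URL scheme asks for it
  http.use_ssl = (uri.scheme == "https")

  response = http.request(Net::HTTP::Get.new(uri.request_uri))
  case response
  when Net::HTTPRedirection
    # The Location header may be relative, so resolve it against the current URL
    next_url = URI.join(url, response["location"]).to_s
    fetch_following_redirects(next_url, limit - 1)
  else
    response
  end
end
```

The limit guards against redirect loops, in the same spirit as the counter in the retry code above.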
require 'rubygems'
require "net/https"
require "uri"
address = "https://www.your-secure-domain-here.com"
uri = URI.parse(address)
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
http.verify_mode = OpenSSL::SSL::VERIFY_NONE
request = Net::HTTP::Get.new(uri.request_uri)
request.basic_auth("username", "password")
response = http.request(request)
In my example, I used username and password in place of SECRET-API-KEY and api_token.
Give it a try and see if it helps.
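Since the question is about a whole list of URLs, here is a sketch of how the HTTPS setup above might be applied per URL. As far as I know, Net::HTTP.get_response on Ruby 1.8 does not turn on SSL by itself for an https URL, which could explain the EOFError, but treat that as an assumption to verify. build_http and fetch_all are hypothetical helper names.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: build a Net::HTTP object matching the URL scheme.
# No connection is opened until a request is actually made.
def build_http(uri)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == "https")
  http
end

# Hypothetical helper: fetch every URL and return a hash of url => result.
# A failing URL is recorded as its exception instead of aborting the loop.
def fetch_all(urls)
  results = {}
  urls.each do |url|
    begin
      uri = URI.parse(url)
      results[url] = build_http(uri).request(Net::HTTP::Get.new(uri.request_uri))
    rescue EOFError, SocketError, SystemCallError, Timeout::Error, URI::InvalidURIError => e
      results[url] = e
    end
  end
  results
end
```

Afterwards you can pick out the failures with results.select { |_, r| r.is_a?(Exception) } and inspect them one at a time. Unlike the snippet above, this sketch leaves certificate verification at its default instead of setting VERIFY_NONE.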