JRuby / Resque: network connections from a Resque job start failing repeatedly

Time: 2011-09-06 18:20:23

Tags: mysql ruby-on-rails jruby resque fog

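I have a strange problem with a Resque job, and I'm wondering whether anyone else has run into it.

We are running Resque under JRuby 1.6.2.

We have a long-running job that downloads a bunch of files from various URLs, uploads them to Rackspace Cloud Files using Fog, and then stores some information about those files in MySQL. After this has been running for a while, our application's network stack seems to fall apart. In one example, the first sign of failure was a timeout from here: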

org/jruby/ext/openssl/SSLSocket.java:512:in `sysread'
/var/www/lisausa/shared/bundle/jruby/1.9/gems/jruby-openssl-0.7.4/lib/openssl/buffering.rb:35:in `fill_rbuff'
/var/www/lisausa/shared/bundle/jruby/1.9/gems/jruby-openssl-0.7.4/lib/openssl/buffering.rb:158:in `eof?'
/var/www/lisausa/shared/bundle/jruby/1.9/gems/jruby-openssl-0.7.4/lib/openssl/buffering.rb:133:in `readline'
/var/www/lisausa/shared/bundle/jruby/1.9/gems/excon-0.6.5/lib/excon/response.rb:22:in `parse'
/var/www/lisausa/shared/bundle/jruby/1.9/gems/excon-0.6.5/lib/excon/connection.rb:174:in `request'
/var/www/lisausa/shared/bundle/jruby/1.9/gems/fog-0.11.0/lib/fog/core/connection.rb:20:in `request'
/var/www/lisausa/shared/bundle/jruby/1.9/gems/fog-0.11.0/lib/fog/storage/rackspace.rb:107:in `request'

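We typically start seeing these problems about 10-15 minutes after the job starts running. After that, we see the following on every subsequent attempt to talk to the database: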

ActiveRecord::JDBCError: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost.: SELECT `bills`.* FROM `bills` WHERE (`bills`.state_session_id = 59)
ActiveRecord::StatementInvalid: ActiveRecord::JDBCError: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost.: SELECT `bills`.* FROM `bills` WHERE (`bills`.state_session_id = 59)
/var/www/lisausa/shared/bundle/jruby/1.9/gems/activerecord-3.0.10/lib/active_record/connection_adapters/abstract_adapter.rb:207:in `log'
/var/www/lisausa/shared/bundle/jruby/1.9/gems/activerecord-3.0.10/lib/active_record/connection_adapters/abstract_adapter.rb:200:in `log'
/var/www/lisausa/shared/bundle/jruby/1.9/gems/activerecord-jdbc-adapter-1.1.1/lib/arjdbc/jdbc/adapter.rb:183:in `execute'
/var/www/lisausa/shared/bundle/jruby/1.9/gems/activerecord-jdbc-adapter-1.1.1/lib/arjdbc/jdbc/adapter.rb:275:in `select'
/var/www/lisausa/shared/bundle/jruby/1.9/gems/activerecord-3.0.10/lib/active_record/connection_adapters/abstract/database_statements.rb:7:in `select_all'
/var/www/lisausa/shared/bundle/jruby/1.9/gems/activerecord-3.0.10/lib/active_record/connection_adapters/abstract/query_cache.rb:56:in `select_all'
/var/www/lisausa/shared/bundle/jruby/1.9/gems/activerecord-3.0.10/lib/active_record/base.rb:473:in `find_by_sql'
/var/www/lisausa/shared/bundle/jruby/1.9/gems/activerecord-3.0.10/lib/active_record/relation.rb:64:in `to_a'
/var/www/lisausa/shared/bundle/jruby/1.9/gems/activerecord-3.0.10/lib/active_record/relation/finder_methods.rb:143:in `all'

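I have tried using the ruby-cloudfiles gem instead of Fog, but we eventually started hitting exactly the same errors with that combination too. If I disable the file download / Cloud Files upload part, these errors never appear, and I have been able to keep this particular job running for days.

Any theories about what might be going on here?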

1 answer:

Answer 0: (score: 0)

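For what it's worth, a few months later, I can tell you that

"ActiveRecord::JDBCError: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost..."

is usually an indicator for us that the database connection has been broken (because of a network problem, the DB crashing, and so on) and the underlying socket has been closed. At least for us, the default behavior of JRuby / ActiveRecord / ConnectionPool is to keep re-raising this error on every command rather than reconnecting.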
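
Since the pool does not reconnect on its own, one possible workaround is to catch the error in the job, reconnect, and retry. The following is only a minimal sketch of that idea, not something taken from our actual code. The helper name `with_reconnect`, the `LOST_CONNECTION` pattern, and the retry limit are all hypothetical, and the `reconnect` argument is any callable you supply. In a Rails app it might wrap `ActiveRecord::Base.connection.reconnect!`, but that is an assumption I have not verified against this JRuby / activerecord-jdbc-adapter combination.

```ruby
# Hypothetical pattern for spotting a dropped connection, based on the
# error message quoted above. Adjust it to whatever your driver reports.
LOST_CONNECTION = /connection was unexpectedly lost/i

# Hypothetical helper: run the block, and if it fails with an error that
# looks like a dropped connection, call `reconnect` and try again.
# `reconnect` is any callable; `max_attempts` bounds the total tries.
def with_reconnect(reconnect, max_attempts = 2)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue StandardError => e
    # Re-raise anything that is not a lost connection, or once we have
    # used up our attempts, so real failures still surface.
    raise unless e.message =~ LOST_CONNECTION && attempts < max_attempts
    reconnect.call
    retry
  end
end

# Example usage inside a job (assumed, untested in this environment):
#
#   reconnect = lambda { ActiveRecord::Base.connection.reconnect! }
#   bills = with_reconnect(reconnect) do
#     Bill.where(:state_session_id => 59).all
#   end
```

Matching on the message text is crude, but ActiveRecord wraps the driver error in `ActiveRecord::StatementInvalid`, so the exception class alone does not distinguish a dead socket from a genuinely bad query. Note that this only papers over the database symptom; it does not explain why the connections break in the first place.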