Resque worker causes ActiveRecord::StatementInvalid: PG::Error: SSL SYSCALL error: EOF detected

Date: 2013-01-01 21:31:18

Tags: ruby-on-rails heroku resque unicorn

I'm now running into this problem on two apps. Heroku itself hasn't been much help (yet).

I'm using:

  • Rails 3.2.9
  • Unicorn
  • Heroku, with the Postgres Dev (free) database and OpenRedis Micro
  • MongoDB (for storing social network statuses)
  • Resque
  • Resque-scheduler

Everything works fine when running against my local Postgres and Redis databases.

Here is a sample of the errors from my Heroku logs:

2013-01-01T21:17:27+00:00 app[resque_worker.1]: Found job on #<Resque::Queue:0x00000006652920>
2013-01-01T21:17:27+00:00 app[resque_worker.1]: got: (Job{facebook} | FacebookRefresh | ["facebook_key"])
2013-01-01T21:17:27+00:00 app[resque_worker.1]: Running before_fork hooks with [(Job{facebook} | FacebookRefresh | ["facebook_key"])]
2013-01-01T21:17:27+00:00 app[resque_worker.1]: Running after_fork hooks with [(Job{facebook} | FacebookRefresh | ["facebook_key"])]
2013-01-01T21:17:27+00:00 app[resque_worker.1]: resque-2.0.0.pre.1: Processing facebook since 1357075047
2013-01-01T21:17:27+00:00 app[resque_worker.1]: resque-2.0.0.pre.1: Forked 503 at 1357075047
2013-01-01T21:17:27+00:00 app[resque_worker.1]: Running before_perform hooks with [(Job{facebook} | FacebookRefresh | ["facebook_key"])]
2013-01-01T21:17:27+00:00 app[resque_worker.1]: :             SELECT a.attname, format_type(a.atttypid, a.atttypmod),
2013-01-01T21:17:27+00:00 app[resque_worker.1]:               FROM pg_attribute a LEFT JOIN pg_attrdef d
2013-01-01T21:17:27+00:00 app[resque_worker.1]:                 ON a.attrelid = d.adrelid AND a.attnum = d.adnum
2013-01-01T21:17:27+00:00 app[resque_worker.1]:                      pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod
2013-01-01T21:17:27+00:00 app[resque_worker.1]: ). Retrying...
2013-01-01T21:17:27+00:00 app[resque_worker.1]: Performing FacebookRefresh caused an exception (PG::Error: SSL SYSCALL error: EOF detected
2013-01-01T21:17:27+00:00 app[resque_worker.1]:              WHERE a.attrelid = '"facebook_accounts"'::regclass
2013-01-01T21:17:27+00:00 app[resque_worker.1]:              ORDER BY a.attnum
2013-01-01T21:17:27+00:00 app[resque_worker.1]:                AND a.attnum > 0 AND NOT a.attisdropped
2013-01-01T21:17:27+00:00 app[resque_worker.1]: :             SELECT a.attname, format_type(a.atttypid, a.atttypmod),
2013-01-01T21:17:27+00:00 app[resque_worker.1]:               FROM pg_attribute a LEFT JOIN pg_attrdef d
2013-01-01T21:17:27+00:00 app[resque_worker.1]:                 ON a.attrelid = d.adrelid AND a.attnum = d.adnum
2013-01-01T21:17:27+00:00 app[resque_worker.1]: (Job{facebook} | FacebookRefresh | ["facebook_key"]) failed: #<ActiveRecord::StatementInvalid: PG::Error: SSL SYSCALL error: EOF detected
2013-01-01T21:17:27+00:00 app[resque_worker.1]: >
2013-01-01T21:17:27+00:00 app[resque_worker.1]:                      pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod
2013-01-01T21:17:27+00:00 app[resque_worker.1]:                AND a.attnum > 0 AND NOT a.attisdropped
2013-01-01T21:17:27+00:00 app[resque_worker.1]:              WHERE a.attrelid = '"facebook_accounts"'::regclass
2013-01-01T21:17:27+00:00 app[resque_worker.1]:              ORDER BY a.attnum
2013-01-01T21:17:27+00:00 app[resque_worker.1]: Running before_fork hooks with [(Job{facebook} | FacebookRefresh | ["facebook_key"])]

I've tried a lot of before_fork and after_fork hooks in my Unicorn config file, but none of them seem to help.

# What the timeout for killing busy workers is, in seconds
timeout 60

# Whether the app should be pre-loaded
preload_app true

# How many worker processes
worker_processes 3

before_fork do |server, worker|
  # Replace with MongoDB or whatever
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.connection.disconnect!
    Rails.logger.info('Disconnected from ActiveRecord')
  end

  # If you are using Redis but not Resque, change this
  if defined?(Resque)
    Resque.redis.quit
    Rails.logger.info('Disconnected from Redis')
  end

  sleep 1
end

after_fork do |server, worker|
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.establish_connection
    Rails.logger.info('Connected to ActiveRecord')
  end

  if defined?(Resque)
    Resque.redis = ENV['OPENREDIS_URL'] || 'redis://localhost:6379'
    Rails.logger.info('Connected to Redis')
  end
end

My Procfile:

web: bundle exec unicorn -c lib/unicorn/config.rb -p $PORT
resque_scheduler: env bundle exec rake resque:scheduler
resque_worker: env QUEUE=* bundle exec rake environment resque:work

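One thing I'm wondering about is that my resque_worker doesn't use the Unicorn config at all, and since it runs on a completely separate Heroku worker, I'm not sure there is any way it would know about those hooks. The web instances work fine, and so does the scheduler. It's only the resque_worker that blows up on every Postgres call.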

I'm not making any particularly crazy database calls from the workers. One example would be:

def queue_users_for_refresh
  FacebookAccount.all.each do |x|
    Resque.enqueue(FacebookAccountRefresh, x.username)
  end
end

Another one, later on (in FacebookAccountRefresh), is:

FacebookAccount.where(:username => user).first

1 Answer:

Answer 0: (score: 5)

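This looks like an error caused by a database connection being incorrectly shared between processes. It happens when a Resque worker does not re-establish its database connection after forking.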
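
To illustrate the failure mode without needing Postgres, here is a minimal sketch in plain Ruby. It is only an analogy under stated assumptions: a UNIXSocket pair stands in for the database connection, and the method name messages_seen_by_server and the message strings are hypothetical, made up for this illustration. It shows that a forked child writes onto the parent's connection unless it opens one of its own.

```ruby
require 'socket'

# Hypothetical stand-in for a database connection: `client` plays the app's
# connection and `server` plays the database end of the same socket.
# Returns the messages the "database" received on the parent's connection.
def messages_seen_by_server(reconnect_in_child:)
  client, server = UNIXSocket.pair

  pid = fork do
    conn = client                       # inherited from the parent by fork
    if reconnect_in_child
      # Comparable in spirit to calling establish_connection after the fork:
      # the child opens a connection of its own instead of reusing the parent's.
      conn, _own_server = UNIXSocket.pair
    end
    conn.syswrite("child-query\n")
    exit!(0)                            # leave without running at_exit handlers
  end
  Process.wait(pid)

  client.syswrite("parent-query\n")
  client.close                          # lets server.read reach end of stream
  messages = server.read.lines.map(&:chomp)
  server.close
  messages
end

p messages_seen_by_server(reconnect_in_child: false)
# prints ["child-query", "parent-query"]
p messages_seen_by_server(reconnect_in_child: true)
# prints ["parent-query"]
```

In the first call the parent's connection carries traffic the parent never sent. With a real Postgres SSL session, two processes using one socket this way would likely corrupt the protocol state, which would be consistent with the EOF errors in the log above.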

Do you have a Resque initializer? It looks like you're missing an after_fork directive for your Resque workers to match the one for your Unicorn app server workers.

Add/edit your Resque initializer file (i.e. config/initializers/resque.rb):

Resque.after_fork = Proc.new { ActiveRecord::Base.establish_connection }