Resque worker error on fork [Ruby, Redis]

Time: 2015-08-26 17:44:27

Tags: ruby redis rake resque

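I am having trouble processing enqueued Resque jobs. The enqueuing part works fine: I can see the jobs in Redis, and Resque.info shows the number of pending jobs incrementing as expected.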

If I explicitly call the perform method of the job class, everything works.

But as soon as the workers are up, all jobs fail.

Here is the job class:

class TestJob
  @queue = :test1_queue

  def self.perform(str)
    begin
      f = open('/tmp/test.txt', 'a')
      f.write("#{Time.now.to_s} #{str} in self.perform\n")
    rescue Exception => e
      f.write("#{Time.now.to_s} #{str} in self.perform\n#{e.message}\n#{e.backtrace}\n")
    ensure
      f.close
    end
  end  
end
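
As a side note, the rescue and ensure branches above call f.write and f.close even when open itself raised, in which case f is nil and a second NoMethodError would mask the real problem. A minimal sketch of a variant that avoids this with the block form of File.open; the names SafeTestJob and LOG_PATH and the extra path parameter are hypothetical, introduced only for illustration:

```ruby
# Hypothetical variant of TestJob; SafeTestJob and LOG_PATH are illustrative names.
class SafeTestJob
  @queue = :test1_queue

  LOG_PATH = '/tmp/test.txt'

  # path is an assumed extra parameter so the method can be tried against
  # a scratch file; a Resque worker would still call perform(str).
  def self.perform(str, path = LOG_PATH)
    # The block form closes the file automatically, even if write raises.
    File.open(path, 'a') do |f|
      f.write("#{Time.now} #{str} in self.perform\n")
    end
  rescue StandardError => e
    # The file may never have been opened, so report to stderr instead.
    warn "#{Time.now} #{str} failed: #{e.message}"
    raise # re-raise so the job is still recorded as failed
  end
end
```

Re-raising keeps the failure visible to whatever runs the job instead of swallowing it.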

The resque.rb initializer:

require 'resque'
require 'redis'
Dir['../../jobs'].each { |file| require file }

Resque.redis = $resque_redis
Resque.logger.level = Logger::WARN
Resque.after_fork do |_|
  $resque_redis.client.reconnect
end
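
One thing worth checking in the initializer above: Dir['../../jobs'] contains no wildcard, so it matches at most the directory itself, and a relative pattern is resolved against the process's current working directory rather than the initializer's location. A hedged sketch of collecting the job files from an absolute directory instead; the helper name job_files and the assumed directory layout are not from the original setup:

```ruby
# Hypothetical helper: absolute paths of all .rb files directly inside jobs_dir.
def job_files(jobs_dir)
  Dir[File.join(File.expand_path(jobs_dir), '*.rb')].sort
end

# Assumed usage inside the initializer, if the jobs directory sits two
# levels above the initializer file:
#   job_files(File.expand_path('../../jobs', __dir__)).each { |file| require file }
```

Sorting makes the load order deterministic, which helps if one job file depends on another.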

The Redis initializer:

require 'redis'
$resque_redis = Redis.new(:host => REDIS_BITMAP_HOST,  :port => REDIS_PORT, :db => 0, :timeout => 30)

The config file that starts the workers with god:

require_relative './common.rb'
watch_resque_process('test1', 1)

The god definitions:

$home_dir = ENV["HOME"]
$rack_env = ENV["ETL_ENV"] || "development"

def create_deafult_monitoring_scheme(watch)
  # Restart if memory is above 150 Megabytes or CPU is above 50% for 5 consecutive intervals
  watch.restart_if do |restart|
    restart.condition(:memory_usage) do |c|
      c.above = 150.megabytes
      c.times = [3, 5] # 3 out of 5 intervals
    end

    restart.condition(:cpu_usage) do |c|
      c.above = 50.percent
      c.times = 5
    end
  end

  # The :flapping condition guards against the edge case wherein god rapidly starts or restarts your application.
  # If this watch is started or restarted five times within 5 minutes, then unmonitor it for 10 minutes.
  # After 10 minutes, monitor it again to see if it was just a temporary problem; if the process is seen to be flapping five times within 30 minutes, then give up completely.
  watch.lifecycle do |on|
    on.condition(:flapping) do |c|
      c.to_state      = [:start, :restart]
      c.times         = 5
      c.within        = 5.minute
      c.transition    = :unmonitored
      c.retry_in      = 10.minute
      c.retry_times   = 5
      c.retry_within  = 30.minute
    end
  end
end

def watch_resque_process(resque_process_name, worker_count=8)
  God.watch do |w|
    w.name      = "resque_work-#{resque_process_name}"
    w.start     = "cd #{$home_dir}/rtb-etl && COUNT=#{worker_count} QUEUE='#{resque_process_name}_queue' RACK_ENV=#{$rack_env} rake resque:workers"
    w.interval  = 30.seconds
    w.log       = File.join($home_dir, 'logs', 'resque', "resque_#{resque_process_name}.log")
    w.err_log   = File.join($home_dir, 'logs', 'resque', "resque_#{resque_process_name}.log")
    w.env       = { 'PIDFILE' => "#{$home_dir}/pids/#{w.name}.pid" }

    # Check if the process is still up every 5 seconds
    w.start_if do |start|
      start.condition(:process_running) do |c|
        c.interval = 5.seconds
        c.running = false
      end
    end

    create_deafult_monitoring_scheme(w)

  end
end

def watch_rake_task(rake_task_name, interval=30.seconds)
  God.watch do |w|
    w.name  = "rake_#{rake_task_name}"
    # w.start = "cd #{$home_dir}/rtb-etl && RACK_ENV=#{$rack_env} bundle exec rake #{rake_task_name}"
    w.start = "cd #{$home_dir}/rtb-etl && RACK_ENV=#{$rack_env} rake #{rake_task_name}"
    w.interval = interval
    w.log      = File.join($home_dir, 'logs', 'resque', "rake_#{rake_task_name}.log")
    w.err_log  = File.join($home_dir, 'logs', 'resque', "rake_#{rake_task_name}.log")
    w.env      = { 'PIDFILE' => "#{$home_dir}/pids/#{w.name}.pid" }

    # Check if the process is still up every 30 seconds
    w.start_if do |start|
      start.condition(:process_running) do |c|
        c.interval = interval
        c.running = false
      end
    end

    create_deafult_monitoring_scheme(w)

  end
end

I then enqueue a job as follows:

irb(main):004:0> Resque.enqueue(TestJob, 'foo')
=> true

To check what went wrong, I ran the following:

Resque::Failure.all(0,20).each { |job|
   puts "#{job["exception"]}  #{job["backtrace"]}"
}
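
The snippet above prints only the exception class and the backtrace, while the actual message lives in the "error" field of each failure record. A small sketch that formats each record on one line; summarize_failures is a hypothetical helper, and it assumes records are hashes with the keys seen in the failure output in this post:

```ruby
# Hypothetical helper: one summary line per failure record.
# Accepts either a single record (Hash) or a list of records.
def summarize_failures(failures)
  records = failures.is_a?(Hash) ? [failures] : failures
  records.map do |job|
    klass = job.fetch('payload', {})['class']
    "#{job['failed_at']} #{klass} on #{job['queue']}: #{job['exception']}: #{job['error']}"
  end
end

# Assumed usage against a live Resque setup:
#   puts summarize_failures(Resque::Failure.all(0, 20))
```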

And got this result:

[{"failed_at"=>"2015/08/26 17:35:00 UTC", 
"payload"=>{"class"=>"TestJob", "args"=>["foo"]},     
"exception"=>"NoMethodError", 
"error"=>"undefined method `client' for nil:NilClass", 
"backtrace"=>[], "worker"=>"ip-172-31-11-211:5006:test1_queue",
"queue"=>"test1_queue"}]

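The error message points at the after_fork hook: it is the only place shown that calls client, so $resque_redis appears to be nil inside the worker process. A minimal sketch of a guard that makes this condition explicit instead of failing every job; reconnect_if_configured is a hypothetical helper, and it assumes the connection object responds to client.reconnect as in the initializer above:

```ruby
# Hypothetical guard: reconnect only if a connection object was actually set up.
# Returns true after reconnecting, false if no connection was configured.
def reconnect_if_configured(connection)
  if connection.nil?
    warn 'Redis connection is nil in this process; was the Redis initializer loaded?'
    return false
  end
  connection.client.reconnect
  true
end

# Assumed usage in the resque.rb initializer:
#   Resque.after_fork { |_| reconnect_if_configured($resque_redis) }
```

The guard only surfaces the problem. If the global really is nil in the worker, the process started by rake still needs to load the Redis initializer before the hook runs, for example from the Rakefile's resque:setup task.
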
Any ideas?

0 answers:

No answers