delayed_job: locking on the jobs table

Time: 2014-02-02 01:59:23

Tags: ruby-on-rails-3 delayed-job postgresql-9.1

Update 3
Hmm... I was using a very old version of delayed_job_active_record. Everything is fine now.

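I have implemented a push notification service in an API I am working on. I use delayed_job to process the push notifications. My problem is that sometimes the worker processes do not seem to obtain a lock on the jobs table; that is, two workers sometimes pick up the same job. I cannot reproduce this consistently, but I am wondering whether anyone else has experienced it. Here is the code that enqueues the jobs: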

Device.where("platform = ? AND enabled = ?", 'ios', true).find_in_batches(batch_size: 2000) do |batch|
  Delayed::Job.enqueue APNWorker.new(params[:push_notification], batch)
end

Device is the table that holds the mobile device tokens. I am testing locally with Foreman.
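For context, Delayed::Job.enqueue accepts any object that responds to perform. The real APNWorker is not shown in the question, so the following is only a hypothetical stand-in: it assumes a Struct over the notification payload and the device batch, and the deliver method is a placeholder rather than an actual APNs call.

```ruby
# Hypothetical stand-in for the APNWorker enqueued above; the real class is not
# shown in the question. delayed_job only requires that the object has #perform.
APNWorker = Struct.new(:notification, :devices) do
  # Called by a worker after it has reserved the job.
  def perform
    devices.map { |device| deliver(device) }
  end

  private

  # Placeholder for the real push delivery; it only builds a description string.
  def deliver(device)
    "push #{notification[:alert].inspect} to #{device[:token]}"
  end
end

# Plain hashes stand in for the request params and the Device records.
job = APNWorker.new({ alert: "hello" }, [{ token: "abc" }, { token: "def" }])
RESULT = job.perform
puts RESULT
```

Because perform delivers to every device in the batch, a job that two workers both run sends every notification in that batch twice.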

Update 1
Here is some output from Foreman:

13:10:41 worker.1 | started with pid 2489
13:10:41 worker.2 | started with pid 2492
13:10:41 worker.3 | started with pid 2495

Then, when I enqueue using the code above, I sometimes get:

13:15:55 worker.1 | work
13:15:55 worker.3 | work

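Here, "work" indicates that a job is being executed. I receive a duplicate push notification. If I check the delayed_jobs table, I see only one locked job. Still, two workers are picking it up.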

Update 2
Here are some logs from Rails:

Delayed::Backend::ActiveRecord::Job Load (1.1ms)  SELECT "delayed_jobs".* FROM "delayed_jobs" WHERE ((run_at <= '2014-02-02 17:42:37.813835' AND (locked_at IS NULL OR locked_at < '2014-02-02 13:42:37.813853') OR locked_by = 'host:positive-definite-fakta-vbox pid:4114') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1
Delayed::Backend::ActiveRecord::Job Load (4.8ms)  SELECT "delayed_jobs".* FROM "delayed_jobs" WHERE ((run_at <= '2014-02-02 17:42:37.772102' AND (locked_at IS NULL OR locked_at < '2014-02-02 13:42:37.772130') OR locked_by = 'host:positive-definite-fakta-vbox pid:4118') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1
(0.1ms)  BEGIN
Delayed::Backend::ActiveRecord::Job Load (0.7ms)  SELECT "delayed_jobs".* FROM "delayed_jobs" WHERE "delayed_jobs"."id" = $1 LIMIT 1 FOR UPDATE  [["id", 537]]
(0.4ms)  UPDATE "delayed_jobs" SET "locked_at" = '2014-02-02 17:42:37.844545', "locked_by" = 'host:positive-definite-fakta-vbox pid:4118', "updated_at" = '2014-02-02 17:42:37.954756' WHERE "delayed_jobs"."id" = 537
(0.6ms)  COMMIT
(3.0ms)  BEGIN
Delayed::Backend::ActiveRecord::Job Load (7.0ms)  SELECT "delayed_jobs".* FROM "delayed_jobs" WHERE "delayed_jobs"."id" = $1 LIMIT 1 FOR UPDATE  [["id", 537]]
(0.4ms)  UPDATE "delayed_jobs" SET "locked_at" = '2014-02-02 17:42:37.869191', "locked_by" = 'host:positive-definite-fakta-vbox pid:4114', "updated_at" = '2014-02-02 17:42:37.997562' WHERE "delayed_jobs"."id" = 537
(0.8ms)  COMMIT
Device Load (0.6ms)  SELECT "devices".* FROM "devices" WHERE "devices"."id" = $1 LIMIT 1  [["id", "18"]]
Device Load (0.6ms)  SELECT "devices".* FROM "devices" WHERE "devices"."id" = $1 LIMIT 1  [["id", "18"]]

As can be seen, both workers get to execute the job ('Device Load ...' is the actual work).

In the delayed_jobs table there is a single entry, locked by:

host:positive-definite-fakta-vbox pid:4114

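What strikes me is that the situation above looks like a completely normal, very likely scenario. The only thing happening is that two workers poll the job queue at almost the same time. I see nothing strange in that... but of course the result is catastrophic. Why does the SELECT ... FOR UPDATE statement: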

Delayed::Backend::ActiveRecord::Job Load (0.7ms)  SELECT "delayed_jobs".* FROM "delayed_jobs" WHERE "delayed_jobs"."id" = $1 LIMIT 1 FOR UPDATE  [["id", 537]]

not check whether the job is already locked? The regular queue polling query seems to do exactly that.
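To make the race in the logs concrete, here is a minimal in-memory sketch. It is not delayed_job's actual code: JobRow, reserve_unchecked, reserve_checked and simulate are hypothetical names, and each reserve method stands for one BEGIN / SELECT ... FOR UPDATE / UPDATE / COMMIT transaction, which the database runs one after another for the same row. The 4-hour window is taken from the timestamps in the polling query above.

```ruby
# In-memory stand-in for one delayed_jobs row (hypothetical, for illustration).
JobRow = Struct.new(:id, :locked_at, :locked_by)

MAX_RUN_TIME = 4 * 3600 # seconds; matches the 4-hour gap in the logged polling query

# Mirrors the logged sequence: lock the row by id, then overwrite the lock
# columns without looking at their current values.
def reserve_unchecked(row, worker, now)
  row.locked_at = now
  row.locked_by = worker
  true
end

# Same transaction, but the availability condition from the polling query is
# re-evaluated after the row lock has been obtained.
def reserve_checked(row, worker, now)
  free = row.locked_at.nil? ||
         row.locked_at < now - MAX_RUN_TIME ||
         row.locked_by == worker
  return false unless free

  row.locked_at = now
  row.locked_by = worker
  true
end

# Both workers polled before either one locked, so both hold job 537 as a
# candidate. Returns the workers that believe they reserved it.
def simulate(strategy)
  row = JobRow.new(537, nil, nil)
  now = Time.now
  ["pid:4118", "pid:4114"].select do |worker|
    if strategy == :checked
      reserve_checked(row, worker, now)
    else
      reserve_unchecked(row, worker, now)
    end
  end
end

UNCHECKED_WINNERS = simulate(:unchecked)
CHECKED_WINNERS = simulate(:checked)
puts "unchecked: #{UNCHECKED_WINNERS.inspect}"
puts "checked:   #{CHECKED_WINNERS.inspect}"
```

In the unchecked variant both workers report success and the row ends up locked by the second one, which matches the single entry locked by pid:4114 described above. In the checked variant the second worker finds the row already locked and backs off.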

0 answers:

No answers