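I've been struggling with this problem and can't solve it. I'm trying to get Redis and Sidekiq to handle background jobs for my Rails project, which is hosted on Cloud66 w/ Digital Ocean. All the required gems seem to be present, and the setup runs perfectly locally.
My first attempt used these settings:
This is my config/sidekiq.yml file: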
---
:concurrency: 25
:pidfile: ./tmp/pids/sidekiq.pid
:logfile: ./log/sidekiq.log
:queues:
- default
- [high_priority, 2]
:daemon: true
Following this tutorial, https://mikecoutermarsh.com/setting-up-redis-on-cloud66-for-sidekiq/, this is the content of my Procfile:
worker: env RAILS_ENV=$RAILS_ENV REDIS_URL=$REDIS_URL_INT bundle exec sidekiq -C config/sidekiq.yml
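The ENV variable $REDIS_URL_INT is set to: redis://104.236.131.187:6379
This ENV variable differs from the one in the tutorial (it includes the port), as suggested in the comments on the blog post.
After deploying with these settings, my Sidekiq log gives me the following: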
2015-05-16T16:19:44.732Z 14636 TID-1g96vc INFO: Booting Sidekiq 3.3.2 with redis options {:url=>"redis://104.236.131.187:6379"}
2015-05-16T16:20:13.801Z 14701 TID-3trg0 INFO: Running in ruby 2.1.5p273 (2014-11-13 revision 48405) [x86_64-linux]
2015-05-16T16:20:13.823Z 14701 TID-3trg0 INFO: See LICENSE and the LGPL-3.0 for licensing details.
2015-05-16T16:20:13.823Z 14701 TID-3trg0 INFO: Upgrade to Sidekiq Pro for more features and support: http://sidekiq.org/pro
2015-05-16T16:20:15.167Z 14701 TID-18nsv4 INFO: Booting Sidekiq 3.3.2 with redis options {:url=>"redis://104.236.131.187:6379"}
2015-05-16T16:20:15.180Z 14701 TID-7791g INFO: Booting Sidekiq 3.3.2 with redis options {:url=>"redis://104.236.131.187:6379"}
2015-05-16T16:20:32.065Z 14753 TID-6uz3g INFO: Running in ruby 2.1.5p273 (2014-11-13 revision 48405) [x86_64-linux]
2015-05-16T16:20:32.066Z 14753 TID-6uz3g INFO: See LICENSE and the LGPL-3.0 for licensing details.
2015-05-16T16:20:32.066Z 14753 TID-6uz3g INFO: Upgrade to Sidekiq Pro for more features and support: http://sidekiq.org/pro
2015-05-16T16:20:32.129Z 14753 TID-1bl0r0 INFO: Booting Sidekiq 3.3.2 with redis options {:url=>"redis://104.236.131.187:6379"}
2015-05-16T16:20:54.584Z 14852 TID-5t1rs INFO: Running in ruby 2.1.5p273 (2014-11-13 revision 48405) [x86_64-linux]
2015-05-16T16:20:54.585Z 14852 TID-5t1rs INFO: See LICENSE and the LGPL-3.0 for licensing details.
2015-05-16T16:20:54.585Z 14852 TID-5t1rs INFO: Upgrade to Sidekiq Pro for more features and support: http://sidekiq.org/pro
2015-05-16T16:20:54.665Z 14852 TID-1aj3m0 INFO: Booting Sidekiq 3.3.2 with redis options {:url=>"redis://104.236.131.187:6379"}
This gives me the impression that Sidekiq keeps restarting. So I looked at the Sidekiq processes:
12747 ? Sl 0:10 sidekiq 3.3.2 web_head [0 of 25 busy]
13540 ? Sl 0:07 sidekiq 3.3.2 web_head [0 of 25 busy]
13596 ? Sl 0:08 sidekiq 3.3.2 web_head [0 of 25 busy]
13650 ? Sl 0:06 sidekiq 3.3.2 web_head [0 of 25 busy]
13702 ? Sl 0:06 sidekiq 3.3.2 web_head [0 of 25 busy]
13758 ? Sl 0:07 sidekiq 3.3.2 web_head [0 of 25 busy]
13818 ? Sl 0:07 sidekiq 3.3.2 web_head [0 of 25 busy]
13869 ? Sl 0:07 sidekiq 3.3.2 web_head [0 of 25 busy]
13934 ? Sl 0:07 sidekiq 3.3.2 web_head [0 of 25 busy]
13986 ? Sl 0:07 sidekiq 3.3.2 web_head [0 of 25 busy]
14089 ? Sl 0:06 sidekiq 3.3.2 web_head [0 of 25 busy]
14144 ? Sl 0:06 sidekiq 3.3.2 web_head [0 of 25 busy]
14196 ? Sl 0:06 sidekiq 3.3.2 web_head [0 of 25 busy]
14259 ? Sl 0:06 sidekiq 3.3.2 web_head [0 of 25 busy]
14311 ? Sl 0:06 sidekiq 3.3.2 web_head [0 of 25 busy]
14363 ? Sl 0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14421 ? Sl 0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14474 ? Sl 0:07 sidekiq 3.3.2 web_head [0 of 25 busy]
14530 ? Sl 0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14585 ? Sl 0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14636 ? Sl 0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14701 ? Sl 0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14753 ? Sl 0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14852 ? Sl 0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14913 ? Sl 0:04 sidekiq 3.3.2 web_head [0 of 25 busy]
14966 ? Sl 0:04 sidekiq 3.3.2 web_head [0 of 25 busy]
15023 ? Sl 0:04 sidekiq 3.3.2 web_head [0 of 25 busy]
That's a lot of Sidekiq action! I didn't ask for this. I only need one.
My current theory is that I'm missing a link between the Rails / Sidekiq / Redis setups. So I added a Redis config, config/redis/production.conf:
daemonize yes
port 6379
logfile ./log/redis_production.log
dbfilename ./db/production.rdb
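This made no difference. Also, neither redis_production.log nor production.rdb was created, so I guess Cloud66 is handling the Redis part. If I check the web console, the Redis server is running on the correct port.
I believe Cloud66 uses Bluepill to manage their processes. There is the following log file, named user_worker_pill.log: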
I, [2015-05-16T16:28:27.157623 #11066] INFO -- : [user_worker:worker:user_worker_1] Going from down => starting
E, [2015-05-16T16:28:47.183939 #11066] ERROR -- : [user_worker:worker:user_worker_1] Failed to signal process 16244 with code 0: No such process
E, [2015-05-16T16:28:47.185674 #11066] ERROR -- : [user_worker:worker:user_worker_1] Failed to signal process 16244 with code 0: No such process
I, [2015-05-16T16:28:47.618515 #11066] INFO -- : [user_worker:worker:user_worker_1] Going from starting => down
E, [2015-05-16T16:28:48.627548 #11066] ERROR -- : [user_worker:worker:user_worker_1] Failed to signal process 16244 with code 0: No such process
E, [2015-05-16T16:28:48.629944 #11066] ERROR -- : [user_worker:worker:user_worker_1] Failed to signal process 16244 with code 0: No such process
D, [2015-05-16T16:28:48.991312 #11066] DEBUG -- : [user_worker] pid journal file: /var/run/bluepill/journals/.bluepill_pids_journal.user_worker_1
D, [2015-05-16T16:28:48.993154 #11066] DEBUG -- : [user_worker] pid journal = 16244
D, [2015-05-16T16:28:48.993257 #11066] DEBUG -- : [user_worker] Acquired lock /var/run/bluepill/journals/.bluepill_pids_journal.user_worker_1.lock
D, [2015-05-16T16:28:48.993396 #11066] DEBUG -- : [user_worker] Unable to term missing process 16244
D, [2015-05-16T16:28:48.993535 #11066] DEBUG -- : [user_worker] Journal cleanup completed
D, [2015-05-16T16:28:48.993595 #11066] DEBUG -- : [user_worker] Cleared lock /var/run/bluepill/journals/.bluepill_pids_journal.user_worker_1.lock
D, [2015-05-16T16:28:48.993654 #11066] DEBUG -- : [user_worker] pgid journal file: /var/run/bluepill/journals/.bluepill_pgids_journal.user_worker_1
D, [2015-05-16T16:28:48.993829 #11066] DEBUG -- : [user_worker] pgid journal = 16241
D, [2015-05-16T16:28:48.993901 #11066] DEBUG -- : [user_worker] Acquired lock /var/run/bluepill/journals/.bluepill_pgids_journal.user_worker_1.lock
D, [2015-05-16T16:28:48.993994 #11066] DEBUG -- : [user_worker] Unable to term missing process group 16241
D, [2015-05-16T16:28:48.995031 #11066] DEBUG -- : [user_worker] Journal cleanup completed
D, [2015-05-16T16:28:48.995180 #11066] DEBUG -- : [user_worker] Cleared lock /var/run/bluepill/journals/.bluepill_pgids_journal.user_worker_1.lock
W, [2015-05-16T16:28:48.995344 #11066] WARN -- : [user_worker:worker:user_worker_1] Executing start command: env RAILS_ENV=production REDIS_URL=redis://104.236.131.187:6379 bundle exec sidekiq -C config/sidekiq.yml
D, [2015-05-16T16:28:49.457935 #11066] DEBUG -- : [user_worker] Acquired lock /var/run/bluepill/journals/.bluepill_pgids_journal.user_worker_1.lock
D, [2015-05-16T16:28:49.458693 #11066] DEBUG -- : [user_worker] pgid journal file: /var/run/bluepill/journals/.bluepill_pgids_journal.user_worker_1
D, [2015-05-16T16:28:49.459430 #11066] DEBUG -- : [user_worker] Saving pgid 16296 to process journal user_worker_1
I, [2015-05-16T16:28:49.459854 #11066] INFO -- : [user_worker] Saved pgid 16296 to journal user_worker_1
D, [2015-05-16T16:28:49.460220 #11066] DEBUG -- : [user_worker] Journal now = 16296
D, [2015-05-16T16:28:49.460454 #11066] DEBUG -- : [user_worker] Cleared lock /var/run/bluepill/journals/.bluepill_pgids_journal.user_worker_1.lock
D, [2015-05-16T16:28:49.460656 #11066] DEBUG -- : [user_worker] Acquired lock /var/run/bluepill/journals/.bluepill_pids_journal.user_worker_1.lock
D, [2015-05-16T16:28:49.460901 #11066] DEBUG -- : [user_worker] pid journal file: /var/run/bluepill/journals/.bluepill_pids_journal.user_worker_1
D, [2015-05-16T16:28:49.461174 #11066] DEBUG -- : [user_worker] Saving pid 16299 to process journal user_worker_1
I, [2015-05-16T16:28:49.462289 #11066] INFO -- : [user_worker] Saved pid 16299 to journal user_worker_1
D, [2015-05-16T16:28:49.462563 #11066] DEBUG -- : [user_worker] Journal now = 16299
D, [2015-05-16T16:28:49.462916 #11066] DEBUG -- : [user_worker] Cleared lock /var/run/bluepill/journals/.bluepill_pids_journal.user_worker_1.lock
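This goes beyond my limited expertise on the subject, but it looks to me as if it is repeatedly trying to restart a crashed process using the command from the Procfile.
That's all the information I was able to gather, and I don't know how to proceed. I'd really appreciate any insights, opinions, or suggestions.
Thanks!
/EDIT
Following Phillip's comment, I changed $REDIS_URL_INT to $REDIS_ADDRESS (the IP without the port), and this is the sidekiq.log: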
2015-05-18T14:00:05.683Z 15878 TID-1dm310 ERROR: heartbeat: Waited 1 sec
2015-05-18T14:00:07.769Z 15878 TID-boxzc ERROR: Waited 1 sec
2015-05-18T14:00:07.769Z 15878 TID-boxzc ERROR: /var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool/timed_stack.rb:85:in `block (2 levels) in pop'
2015-05-18T14:00:08.770Z 15878 TID-boxzc WARN: {:context=>"scheduling poller thread died!"}
2015-05-18T14:00:08.771Z 15878 TID-boxzc WARN: Waited 1 sec
2015-05-18T14:00:08.771Z 15878 TID-boxzc WARN: /var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool/timed_stack.rb:85:in `block (2 levels) in pop'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool/timed_stack.rb:77:in `loop'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool/timed_stack.rb:77:in `block in pop'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool/timed_stack.rb:76:in `synchronize'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool/timed_stack.rb:76:in `pop'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool.rb:78:in `checkout'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool.rb:60:in `with'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq.rb:74:in `redis'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq/api.rb:634:in `cleanup'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq/api.rb:627:in `initialize'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq/scheduled.rb:87:in `new'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq/scheduled.rb:87:in `poll_interval'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq/scheduled.rb:66:in `block in poll'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq/util.rb:16:in `watchdog'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq/scheduled.rb:51:in `poll'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `public_send'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `dispatch'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:122:in `dispatch'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/cell.rb:60:in `block in invoke'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/cell.rb:71:in `block in task'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/actor.rb:357:in `block in task'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/tasks.rb:57:in `block in initialize'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/tasks/task_fiber.rb:15:in `block in create'
2015-05-18T14:00:08.774Z 15878 TID-1dm5j0 WARN: Sidekiq died due to the following error, cannot recover, process exiting
2015-05-18T14:00:08.775Z 15878 TID-1dm5j0 WARN: Waited 1 sec
2015-05-18T14:00:08.776Z 15878 TID-1dm5j0 WARN: /var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool/timed_stack.rb:85:in `block (2 levels) in pop'
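Answer 0 (score: 1)
I'm adding another answer to make this solution clearer. I took a closer look, and your Sidekiq configuration is actually daemonizing, whereas processes should run in the foreground so that we can control them. That's why you see so many Sidekiq processes running: our Bluepill starts one, thinks it didn't come up, and so starts more.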
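To make that failure mode concrete, here is a rough Ruby sketch, assuming a POSIX system. `launch_daemonizing_worker` and `alive?` are hypothetical names for illustration only; this is not Bluepill or Sidekiq code. It mimics a program that daemonizes itself: the launched process exits immediately while a forked child keeps running, so a supervisor probing the launched pid with signal 0 gets "No such process" even though a worker is still alive.

```ruby
# Hypothetical stand-in for a self-daemonizing worker: the launched process
# forks a long-lived child and exits right away.
def launch_daemonizing_worker
  reader, writer = IO.pipe
  launcher = fork do
    reader.close
    worker = fork do
      sleep 5      # the "real" worker keeps running in the background
      exit!(0)
    end
    writer.puts(worker)  # report the background pid to the caller
    exit!(0)             # the launched process itself is already gone
  end
  writer.close
  Process.wait(launcher)
  worker_pid = reader.gets.to_i
  reader.close
  [launcher, worker_pid]
end

# Liveness probe with signal 0: nothing is delivered, the call only
# checks whether the pid exists.
def alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false
end
```

A supervisor that only tracks the first pid sees `alive?(launcher_pid)` return false and starts another copy, while `alive?(worker_pid)` is still true for the orphaned worker.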
If you remove :daemon: true from sidekiq.yml and redeploy, that should fix the problem.
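If you want to double-check the file before redeploying, something along these lines might help. It is only a sketch, assuming the config uses symbol-style keys as in the question; `daemonizes?` is a hypothetical helper, not part of Sidekiq.

```ruby
require "yaml"

# Returns true when a Sidekiq YAML config asks Sidekiq to daemonize itself.
# Symbol is permitted because keys such as :daemon: are parsed as symbols.
def daemonizes?(yaml_text)
  config = YAML.safe_load(yaml_text, permitted_classes: [Symbol]) || {}
  config[:daemon] == true || config["daemon"] == true
end
```

For example, `daemonizes?(File.read("config/sidekiq.yml"))` should be false once the line has been removed.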
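Answer 1 (score: 0)
The repeated messages could be because Sidekiq can't connect to Redis. Are you sure you want to use the public IP in $REDIS_URL_INT? If so, have you allowed access on the correct port? If they're on the same box, maybe use 0.0.0.0 or similar.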
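One way to test that from the worker box is a plain TCP check, sketched below using only the Ruby standard library. `redis_reachable?` is a hypothetical helper, and falling back to port 6379 when the URL has none is an assumption. It only shows whether the port accepts connections; it does not speak the Redis protocol or check authentication.

```ruby
require "socket"
require "uri"

# Returns true if a TCP connection to the host and port in the URL succeeds
# within the timeout (in seconds), false otherwise.
def redis_reachable?(url, timeout: 2)
  uri = URI.parse(url)
  return false if uri.host.nil?  # e.g. a bare IP without the redis:// scheme
  Socket.tcp(uri.host, uri.port || 6379, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError, URI::InvalidURIError
  false
end
```

Running `redis_reachable?(ENV["REDIS_URL_INT"])` over SSH on the server would show whether a firewall is in the way.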
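Answer 2 (score: 0)
Connecting to the Redis server on the external IP address shouldn't be a problem (given the firewall setup), but if you SSH into the server, can you run this command manually to see what it outputs? That way you can also set the connection parameters directly, which makes troubleshooting easier. I don't see anything obviously wrong with the setup.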
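As one possible way to set the connection parameters directly, a Sidekiq initializer along these lines could be used. This is a sketch, not the configuration Cloud66 generates: the file location (config/initializers/sidekiq.rb), the `redis_options` helper, and the localhost fallback URL are all assumptions.

```ruby
# Builds the Redis options from the environment; the fallback URL is only
# a placeholder for local development.
def redis_options(env = ENV)
  { url: env.fetch("REDIS_URL", "redis://localhost:6379") }
end

# Only touch Sidekiq when the gem is loaded, so the helper can be
# exercised on its own.
if defined?(Sidekiq)
  Sidekiq.configure_server { |config| config.redis = redis_options }
  Sidekiq.configure_client { |config| config.redis = redis_options }
end
```

With this in place the URL Sidekiq uses is visible in one spot, which makes it easier to swap in a different address while troubleshooting.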
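By the way, the reason (?s) is set to the external IP address is that DigitalOcean SF did not support private networking. They do now (although they haven't announced the change), so we'll make the update on our side as well.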