How can I troubleshoot Hypnotoad worker failures?

Time: 2015-04-01 21:52:26

Tags: mojolicious

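I have an application served by Hypnotoad, with no reverse proxy. It has 15 workers, each allowing 2 clients. The application is started with hypnotoad in foreground mode.

I see the following in log/production.log: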

[Wed Apr  1 16:28:12 2015] [error] Worker 119914 has no heartbeat, restarting.
[Wed Apr  1 16:28:21 2015] [error] Worker 119910 has no heartbeat, restarting.
[Wed Apr  1 16:28:21 2015] [error] Worker 119913 has no heartbeat, restarting.
[Wed Apr  1 16:28:22 2015] [error] Worker 119917 has no heartbeat, restarting.
[Wed Apr  1 16:28:22 2015] [error] Worker 119909 has no heartbeat, restarting.
[Wed Apr  1 16:28:27 2015] [error] Worker 119907 has no heartbeat, restarting.
[Wed Apr  1 16:28:34 2015] [error] Worker 119905 has no heartbeat, restarting.
[Wed Apr  1 16:28:42 2015] [error] Worker 119904 has no heartbeat, restarting.
[Wed Apr  1 16:30:12 2015] [error] Worker 119912 has no heartbeat, restarting.
[Wed Apr  1 16:31:23 2015] [error] Worker 119918 has no heartbeat, restarting.
[Wed Apr  1 16:32:18 2015] [error] Worker 119911 has no heartbeat, restarting.
[Wed Apr  1 16:32:22 2015] [error] Worker 119916 has no heartbeat, restarting.

However, the workers never restart.

When I run strace on it, the manager process appears to be valiantly trying to kill the (now expired) workers:

Process 119878 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = 0
kill(119906, SIGKILL)                   = 0
kill(119917, SIGKILL)                   = 0
kill(119905, SIGKILL)                   = 0
kill(119910, SIGKILL)                   = 0
kill(119904, SIGKILL)                   = 0
kill(119914, SIGKILL)                   = 0
kill(119916, SIGKILL)                   = 0
kill(119908, SIGKILL)                   = 0
kill(119913, SIGKILL)                   = 0
kill(119915, SIGKILL)                   = 0
kill(119918, SIGKILL)                   = 0
kill(119912, SIGKILL)                   = 0
kill(119909, SIGKILL)                   = 0
kill(119911, SIGKILL)                   = 0
kill(119907, SIGKILL)                   = 0
stat("/xxx/xxx/xxx/hypnotoad.pid", {st_mode=S_IFREG|0644, st_size=6, ...}) = 0
poll([{fd=4, events=POLLIN|POLLPRI}], 1, 1000) = 0 (Timeout)
kill(119906, SIGKILL)                   = 0
kill(119917, SIGKILL)                   = 0
kill(119905, SIGKILL)                   = 0
kill(119910, SIGKILL)                   = 0
kill(119904, SIGKILL)                   = 0
kill(119914, SIGKILL)                   = 0
kill(119916, SIGKILL)                   = 0
kill(119908, SIGKILL)                   = 0
kill(119913, SIGKILL)                   = 0
kill(119915, SIGKILL)                   = 0
kill(119918, SIGKILL)                   = 0
kill(119912, SIGKILL)                   = 0
kill(119909, SIGKILL)                   = 0
kill(119911, SIGKILL)                   = 0
kill(119907, SIGKILL)                   = 0
stat("/xxx/xxx/xxx/hypnotoad.pid", {st_mode=S_IFREG|0644, st_size=6, ...}) = 0
poll([{fd=4, events=POLLIN|POLLPRI}], 1, 1000^C <unfinished ...>
Process 119878 detached

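How can I troubleshoot further to determine:

  1. Why does Hypnotoad think it still needs to kill processes that no longer exist?
  2. Why isn't it starting new ones?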

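1 Answer:

Answer 0 (score: 2)

What does "Worker 31842 has no heartbeat, restarting" mean?

As long as they are accepting new connections, worker processes of all built-in preforking web servers send heartbeat messages to the manager process at regular intervals to signal that they are still responsive. A blocking operation in your application, such as an infinite loop, can prevent this and will force the affected worker to be restarted after a timeout. This timeout defaults to 20 seconds and can be extended with the attribute "heartbeat_timeout" in Mojo::Server::Prefork if your application requires it.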

http://mojolicio.us/perldoc/Mojolicious/Guides/FAQ#What-does-Worker-31842-has-no-heartbeat-restarting-mean
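
As a rough sketch of the setting the FAQ entry refers to: Hypnotoad reads its settings from the "hypnotoad" section of the application's configuration, and "heartbeat_timeout" is one of them. The file name myapp.conf and the value 60 below are assumptions chosen for illustration; the worker and client counts are the ones given in the question.

```perl
# myapp.conf (hypothetical file name), loaded through the application's config
{
  hypnotoad => {
    workers           => 15,  # worker count stated in the question
    clients           => 2,   # clients per worker stated in the question
    heartbeat_timeout => 60,  # illustrative value; the FAQ gives 20 as the default
  }
};
```

A longer timeout only gives slow but finite blocking operations more headroom. It would not help a worker that is blocked indefinitely, and on its own it does not explain why replacement workers are not being started.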