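I have an application served by Hypnotoad, with no reverse proxy in front of it. It has 15 workers, each allowed 2 clients. The application is launched via hypnotoad in foreground mode.
I am seeing the following in log/production.log: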
[Wed Apr 1 16:28:12 2015] [error] Worker 119914 has no heartbeat, restarting.
[Wed Apr 1 16:28:21 2015] [error] Worker 119910 has no heartbeat, restarting.
[Wed Apr 1 16:28:21 2015] [error] Worker 119913 has no heartbeat, restarting.
[Wed Apr 1 16:28:22 2015] [error] Worker 119917 has no heartbeat, restarting.
[Wed Apr 1 16:28:22 2015] [error] Worker 119909 has no heartbeat, restarting.
[Wed Apr 1 16:28:27 2015] [error] Worker 119907 has no heartbeat, restarting.
[Wed Apr 1 16:28:34 2015] [error] Worker 119905 has no heartbeat, restarting.
[Wed Apr 1 16:28:42 2015] [error] Worker 119904 has no heartbeat, restarting.
[Wed Apr 1 16:30:12 2015] [error] Worker 119912 has no heartbeat, restarting.
[Wed Apr 1 16:31:23 2015] [error] Worker 119918 has no heartbeat, restarting.
[Wed Apr 1 16:32:18 2015] [error] Worker 119911 has no heartbeat, restarting.
[Wed Apr 1 16:32:22 2015] [error] Worker 119916 has no heartbeat, restarting.
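However, the workers never actually get restarted.
When I run strace on the manager process, it appears to be valiantly trying to kill the (now defunct) workers: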
Process 119878 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = 0
kill(119906, SIGKILL) = 0
kill(119917, SIGKILL) = 0
kill(119905, SIGKILL) = 0
kill(119910, SIGKILL) = 0
kill(119904, SIGKILL) = 0
kill(119914, SIGKILL) = 0
kill(119916, SIGKILL) = 0
kill(119908, SIGKILL) = 0
kill(119913, SIGKILL) = 0
kill(119915, SIGKILL) = 0
kill(119918, SIGKILL) = 0
kill(119912, SIGKILL) = 0
kill(119909, SIGKILL) = 0
kill(119911, SIGKILL) = 0
kill(119907, SIGKILL) = 0
stat("/xxx/xxx/xxx/hypnotoad.pid", {st_mode=S_IFREG|0644, st_size=6, ...}) = 0
poll([{fd=4, events=POLLIN|POLLPRI}], 1, 1000) = 0 (Timeout)
kill(119906, SIGKILL) = 0
kill(119917, SIGKILL) = 0
kill(119905, SIGKILL) = 0
kill(119910, SIGKILL) = 0
kill(119904, SIGKILL) = 0
kill(119914, SIGKILL) = 0
kill(119916, SIGKILL) = 0
kill(119908, SIGKILL) = 0
kill(119913, SIGKILL) = 0
kill(119915, SIGKILL) = 0
kill(119918, SIGKILL) = 0
kill(119912, SIGKILL) = 0
kill(119909, SIGKILL) = 0
kill(119911, SIGKILL) = 0
kill(119907, SIGKILL) = 0
stat("/xxx/xxx/xxx/hypnotoad.pid", {st_mode=S_IFREG|0644, st_size=6, ...}) = 0
poll([{fd=4, events=POLLIN|POLLPRI}], 1, 1000^C <unfinished ...>
Process 119878 detached
How can I troubleshoot further to determine:
Answer 0 (score: 2)
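What does "Worker 31842 has no heartbeat, restarting" mean?
As long as they are accepting new connections, the worker processes of all built-in preforking web servers send heartbeat messages to the manager process at regular intervals to signal that they are still responsive. A blocking operation in your application, such as an infinite loop, can prevent this and will force the affected worker to be restarted after a timeout. This timeout defaults to 20 seconds and can be extended with the "heartbeat_timeout" attribute in Mojo::Server::Prefork if your application requires it.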
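For illustration, here is a minimal sketch of how the timeout might be raised through the application's Hypnotoad settings. The file name myapp.conf and the value of 60 seconds are assumptions, not something taken from the question; the workers and clients values simply mirror the setup described there.

```perl
# myapp.conf (hypothetical file name), loaded by the Mojolicious Config plugin
{
  hypnotoad => {
    workers => 15,    # as described in the question
    clients => 2,     # as described in the question

    # Assumed value for illustration: seconds a worker may go without
    # sending a heartbeat before the manager restarts it
    heartbeat_timeout => 60
  }
};
```

Raising the timeout only helps when requests legitimately block for longer than the default. It will not revive a worker that is stuck for some other reason.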