Question

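This question has been asked many times, but none of the answers helped. After hours of digging, I'm asking for help here. I'm a developer with limited sysadmin experience, but our ops person is gone, so I've been left trying to keep things alive.

One of our websites recently started returning random 502 errors. It happens fairly regularly, at least a dozen times a day (as reported by Nagios as well as by our users). I'm not aware of any configuration changes. The web stack is standard: an nginx server proxies requests to php-fpm, which runs a WordPress-based application.

The nginx error log contains many messages like this: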

[error] 31180#31180: *451395 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: x.x.x.x, server: x.x, request: "GET /x/x/ HTTP/1.0", upstream: "fastcgi://127.0.0.1:9000", host: "x.x.x"

Most of them have the server's own IP as the client IP (not sure why, maybe monitoring?), but there are also errors for random public IPs.
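To put numbers on "most of them", a small script along these lines could tally the upstream-reset errors per client address. This is only a sketch: the function name `count_reset_clients` is made up for illustration, and it assumes the log entries have the same layout as the one quoted above.

```python
import re
from collections import Counter

# Matches the "client: <address>," field of an nginx error log entry.
CLIENT_RE = re.compile(r"client: ([^,\s]+),")
# Substring that identifies the upstream-reset errors quoted above.
RESET_MARKER = "recv() failed (104: Connection reset by peer)"


def count_reset_clients(lines):
    """Count upstream-reset errors per client address (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        if RESET_MARKER not in line:
            continue  # some other kind of error, ignore it
        match = CLIENT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

It could be fed an open log file, for example `count_reset_clients(open(path, errors="replace")).most_common(10)`, where `path` is wherever nginx writes its error log on the machine in question.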

The PHP-FPM log emits warnings like these about once an hour:

WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 0 idle, and 71 total children
WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 0 idle, and 75 total children
WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 79 total children
WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 83 total children
WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 87 total children
WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 91 total children
WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 95 total children
WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 99 total children
WARNING: [pool www] server reached pm.max_children setting (100), consider raising it
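To see how often the pool actually runs out of workers, the warnings above could be summarized with something like the following. It is a rough sketch: `summarize` and the result keys are names invented here, and the patterns only cover the two message shapes shown in this log excerpt.

```python
import re

# "seems busy" warning: captures spawn count, idle count and total children.
BUSY_RE = re.compile(
    r"\[pool (?P<pool>[^\]]+)\] seems busy .*?"
    r"spawning (?P<spawning>\d+) children, "
    r"there are (?P<idle>\d+) idle, and (?P<total>\d+) total children"
)
# Warning emitted when the pool hits its configured pm.max_children.
LIMIT_RE = re.compile(
    r"\[pool (?P<pool>[^\]]+)\] server reached pm\.max_children "
    r"setting \((?P<limit>\d+)\)"
)


def summarize(lines):
    """Return the peak child count seen and how often max_children was hit."""
    peak_total = 0
    limit_hits = 0
    for line in lines:
        busy = BUSY_RE.search(line)
        if busy:
            peak_total = max(peak_total, int(busy.group("total")))
        elif LIMIT_RE.search(line):
            limit_hits += 1
    return {"peak_total_children": peak_total, "max_children_hits": limit_hits}
```

Comparing the timestamps of the `max_children` hits with the timestamps of the 502s in the nginx log would show whether the two line up.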

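Things I've tried

Restarting

Obvious, but it didn't help at all.

Increasing resources and PHP-FPM children

Adding more RAM and CPU didn't help. The disk isn't full, and inode usage is low.
With the extra resources, I set pm.max_children to 100. It was originally 40, which had been fine for years of operation. After seeing the logs, I tried 75, then 100.
Another website with several times more visitors runs on less hardware and works fine. This site doesn't serve anything heavy; it's mostly blogs.

For completeness, the FPM configuration looks like this: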

pm.max_children = 100
pm.start_servers = 24
pm.min_spare_servers = 4
pm.max_spare_servers = 64
pm.max_requests = 500

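There's also no mention of the OOM killer running in the logs.

Investigating opcache

I read that opcache running out of memory might be the culprit. Alas, it has memory to spare: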

Cache hits  89757614
Cache misses    1174
Used memory 58333696
Free memory 75884032
Wasted memory   0
OOM restarts    0
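As a quick sanity check on those figures, the hit rate and memory usage can be worked out directly. This is just arithmetic on the numbers above; `opcache_summary` is a throwaway helper name, not part of any opcache API.

```python
def opcache_summary(hits, misses, used, free, wasted):
    """Derive hit rate and memory usage from raw opcache counters."""
    total = used + free + wasted
    return {
        "hit_rate": hits / (hits + misses),
        "used_fraction": used / total,
        "total_bytes": total,
    }


# Values copied from the opcache status output above.
stats = opcache_summary(
    hits=89757614, misses=1174, used=58333696, free=75884032, wasted=0
)
print(f"hit rate:    {stats['hit_rate']:.4%}")
print(f"memory used: {stats['used_fraction']:.1%} of {stats['total_bytes']} bytes")
```

Used plus free adds up to exactly 128 MiB, of which a bit under half is in use, so the cache doesn't look close to full.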

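Nginx timeouts

The nginx parameters shouldn't be the problem, since the buffer and timeout values seem very large (I'm assuming the 3000 is in seconds):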

client_header_timeout 3000;
client_body_timeout 3000;
fastcgi_read_timeout 3000;
fastcgi_buffers 16 16k;
fastcgi_buffer_size 32k;

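Other information

PHP-FPM doesn't crash; its log contains nothing except the warnings about children
xdebug is disabled
syslog and dmesg contain no relevant messages
php7.0, nginx 1.12.2

Is there anything else I can try?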

Nginx + PHP-FPM occasionally returns 502

0 answers: