Question

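I have two r5a.xlarge EC2 instances in my environment. Each instance has 4 vCPUs and 32 GiB of memory. The application processes some files and returns the data to the client as JSON. Two of the files it processes are also large (about 1.5 GB). I have no database connection. The application uses Python 3.6 with Flask and runs on an Apache server.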

After a few incoming requests, the instance goes into a "Degraded" state. The causes shown are:

25.0 % of the requests are failing with HTTP 5xx.
Instance ELB health state has been "OutOfService" for 1 hour 23 minutes: Instance has failed at least the UnhealthyThreshold number of health checks consecutively.
99 % of CPU is in use. 96 % in I/O wait.
100 % of memory is in use.

It stays in this state even after the incoming requests are stopped.

The other instance, for some reason, has the wrong version deployed.

Incorrect application version "app-xxxxxxx" (deployment 24). Expected version "app-yyyyyy" (deployment 23).

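I set the capacity in the load balancer to 0, which removed both instances. I redeployed the application and then set the capacity back to the original settings: Min = 1, Max = 2, Desired = 2.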

I did this so that I would get new instances with the correct version of the code base.

Now 1 instance is running, and after more than 7-8 requests it goes into the Degraded state again. The cause is again:

100 % of memory is in use.

I have tried creating swap space as described here.

I even checked the httpd_error log but found no errors related to this. These are all the entries in the httpd_error file:

[suexec:notice] [pid 2880] AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[http2:warn] [pid 2880] AH10034: The mpm module (prefork.c) is not supported by mod_http2. The mpm determines how things are processed in your server. HTTP/2 has more demands in this regard and the currently selected mpm will just not do. This is an advisory warning. Your server will continue to work, but the HTTP/2 protocol will be inactive.
[http2:warn] [pid 2880] AH02951: mod_ssl does not seem to be enabled
[lbmethod_heartbeat:notice] [pid 2880] AH02282: No slotmem from mod_heartmonitor
[:warn] [pid 2880] mod_wsgi: Compiled for Python/3.6.2.
[:warn] [pid 2880] mod_wsgi: Runtime using Python/3.6.12.
[mpm_prefork:notice] [pid 2880] AH00163: Apache/2.4.46 (Amazon) mod_wsgi/3.5 Python/3.6.12 configured -- resuming normal operations

How do I even begin to troubleshoot this?

Answer 1

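The problem was in the application. I had to make some major changes to the way the application reads and processes data. It was reading a large amount of data for every request, which pretty much overloaded the server and caused it to hang.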
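
For illustration only, here is a minimal sketch of one common way to avoid rereading a large file on every request: parse it once per process and reuse the result. This is not necessarily the change that was made here. The names load_dataset, handle_request and LOAD_COUNT are hypothetical, the input is assumed to be JSON purely for the demo, and handle_request stands in for a Flask view function.

```python
import functools
import json
import os
import tempfile

LOAD_COUNT = 0  # hypothetical counter, only here to make the caching visible


@functools.lru_cache(maxsize=None)
def load_dataset(path):
    """Parse the file at `path` once per process and keep the result in memory."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def handle_request(path, key):
    """Stand-in for a Flask view: look up one value in the cached data."""
    data = load_dataset(path)  # only the first call reads the file
    return json.dumps({key: data.get(key)})


# Small demo with a throwaway file instead of a 1.5 GB input
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump({"a": 1, "b": 2}, tmp)
    DEMO_PATH = tmp.name

first = handle_request(DEMO_PATH, "a")
second = handle_request(DEMO_PATH, "b")
os.remove(DEMO_PATH)
```

Note that the cache lives inside a single process. With the prefork MPM shown in the log above, each Apache child process that serves the application would hold its own copy, so the number of worker processes still has to fit within the available memory.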

Degraded health in Elastic Beanstalk - unable to find out why
