Troubleshooting dotnet process in docker container

时间:2018-09-28 18:41:36

标签: docker .net-core amazon-ecs asp.net-core-2.1

We are building a microservices platform based on dotnet core and docker. We are hosting this inside AWS using ECS running linux containers on linux hosts.

I am running into a problem with the dotnet processes stuck at 100% cpu on the host after a load test is finished while receiving no traffic. I have been trying to troubleshoot some performance issues related to this we have been having and I have done the following things:

  1. Made HttpClient a singleton so it can reuse connections
  2. Increased memory size of my containers from 128mb to 256mb (the dotnet containers wanted to grow larger than 128)

These updates have helped a bit, but I am still seeing strange behavior in the dotnet process running on the host. Locally this problem does not happen, I can run a load test, and while the test is running cpus are high, but once its done they go back down. On the EC2 host the processes say at 100% for a few minutes after.

Has anyone experienced things like this, or have any ideas on how to troubleshoot this? I have tried looking at process information on the host, but can't see much.

Here is an example of what my machine looks like after the load tests have finished but the server is at 100% cpu:

top and docker stats on host

中变量的XPath

-----------编辑2018-10-01 -----------

我在将dotnet日志记录级别设置为debug的情况下运行了负载测试,结果如下:

    TIME 18:45:45 - Last requests goes through

dbug: Microsoft.AspNetCore.Server.Kestrel[9]
Connection id "0HLH7PMRMMAFR" completed keep alive response.
info: Microsoft.AspNetCore.Hosting.Internal.WebHost[2]
Request finished in 25.9946ms 200 application/json; charset=utf-8
Date=2018-10-01T18:45:45&Service=user&RequestTime=157&PortalId=56&Path=/user/56/v1/user&Method=POST&Action=POST user/{portalid}/v1/user&IPAddress=_IP_&ApiKey=__Key&ResponseCode=200&RequestBody=_BodyData_&Response=_responseData_&ContainerId=f7e4bf541a31&RequestId=
dbug: Microsoft.AspNetCore.Server.Kestrel[9]
Connection id "0HLH7PMRMMAFP" completed keep alive response.
info: Microsoft.AspNetCore.Hosting.Internal.WebHost[2]
Request finished in 160.5087ms 200 application/json; charset=utf-8

    TIME 18:46:01 - 18:47:45  See some HealthCheck requests come in

dbug: Microsoft.AspNetCore.Server.Kestrel[1]
Connection id "0HLH7PMRMMAFV" started.
dbug: Microsoft.AspNetCore.Server.Kestrel[1]
Connection id "0HLH7PMRMMAG0" started.
info: Microsoft.AspNetCore.Hosting.Internal.WebHost[1]
Request starting HTTP/1.1 GET http://10.0.1.73:32800/apigateway/0/v1/info
info: Microsoft.AspNetCore.Hosting.Internal.WebHost[2]
Request finished in 0.0899ms 200 application/json
dbug: Microsoft.AspNetCore.Server.Kestrel[10]
Connection id "0HLH7PMRMMAG0" disconnecting.
dbug: Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets[7]
Connection id "0HLH7PMRMMAG0" sending FIN.
dbug: Microsoft.AspNetCore.Server.Kestrel[2]
Connection id "0HLH7PMRMMAG0" stopped.
info: Microsoft.AspNetCore.Hosting.Internal.WebHost[1]
Request starting HTTP/1.1 GET http://10.0.1.73:32800/apigateway/0/v1/info
info: Microsoft.AspNetCore.Hosting.Internal.WebHost[2]
Request finished in 0.056ms 200 application/json
dbug: Microsoft.AspNetCore.Server.Kestrel[10]
Connection id "0HLH7PMRMMAFV" disconnecting.
dbug: Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets[7]
Connection id "0HLH7PMRMMAFV" sending FIN.
dbug: Microsoft.AspNetCore.Server.Kestrel[2]
Connection id "0HLH7PMRMMAFV" stopped.
dbug: Microsoft.AspNetCore.Server.Kestrel[1]

    TIME: 18:47:45 - 18:47:47 - Connections finally are closed?

dbug: Microsoft.AspNetCore.Server.Kestrel[2]
Connection id "0HLH7PMRMMAGA" stopped.
dbug: Microsoft.AspNetCore.Server.Kestrel[10]
Connection id "0HLH7PMRMMAFN" disconnecting.
dbug: Microsoft.AspNetCore.Server.Kestrel[10]
Connection id "0HLH7PMRMMAFM" disconnecting.
dbug: Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets[7]
Connection id "0HLH7PMRMMAFM" sending FIN.
dbug: Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets[7]
Connection id "0HLH7PMRMMAFN" sending FIN.
dbug: Microsoft.AspNetCore.Server.Kestrel[2]
Connection id "0HLH7PMRMMAFM" stopped.
dbug: Microsoft.AspNetCore.Server.Kestrel[2]
Connection id "0HLH7PMRMMAFN" stopped.
dbug: Microsoft.AspNetCore.Server.Kestrel[10]
Connection id "0HLH7PMRMMAFK" disconnecting.
dbug: Microsoft.AspNetCore.Server.Kestrel[10]
Connection id "0HLH7PMRMMAFL" disconnecting.
dbug: Microsoft.AspNetCore.Server.Kestrel[10]
Connection id "0HLH7PMRMMAFU" disconnecting.
dbug: Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets[7]
Connection id "0HLH7PMRMMAFK" sending FIN.
dbug: Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets[7]
Connection id "0HLH7PMRMMAFU" sending FIN.
dbug: Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets[7]
Connection id "0HLH7PMRMMAFL" sending FIN.
dbug: Microsoft.AspNetCore.Server.Kestrel[2]
Connection id "0HLH7PMRMMAFK" stopped.
dbug: Microsoft.AspNetCore.Server.Kestrel[2]
Connection id "0HLH7PMRMMAFU" stopped.
dbug: Microsoft.AspNetCore.Server.Kestrel[2]
Connection id "0HLH7PMRMMAFL" stopped.
dbug: Microsoft.AspNetCore.Server.Kestrel[10]
Connection id "0HLH7PMRMMAFS" disconnecting.
dbug: Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets[7]
Connection id "0HLH7PMRMMAFS" sending FIN.
dbug: Microsoft.AspNetCore.Server.Kestrel[2]
Connection id "0HLH7PMRMMAFS" stopped.
dbug: Microsoft.AspNetCore.Server.Kestrel[10]
Connection id "0HLH7PMRMMAFO" disconnecting.
dbug: Microsoft.AspNetCore.Server.Kestrel[10]
Connection id "0HLH7PMRMMAFP" disconnecting.
dbug: Microsoft.AspNetCore.Server.Kestrel[10]
Connection id "0HLH7PMRMMAFR" disconnecting.
dbug: Microsoft.AspNetCore.Server.Kestrel[10]
Connection id "0HLH7PMRMMAFQ" disconnecting.
dbug: Microsoft.AspNetCore.Server.Kestrel[10]
Connection id "0HLH7PMRMMAFT" disconnecting.
dbug: Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets[7]
Connection id "0HLH7PMRMMAFP" sending FIN.
dbug: Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets[7]
Connection id "0HLH7PMRMMAFO" sending FIN.
dbug: Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets[7]
Connection id "0HLH7PMRMMAFR" sending FIN.
dbug: Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets[7]
Connection id "0HLH7PMRMMAFQ" sending FIN.
dbug: Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets[7]
Connection id "0HLH7PMRMMAFT" sending FIN.
dbug: Microsoft.AspNetCore.Server.Kestrel[2]
Connection id "0HLH7PMRMMAFP" stopped.
dbug: Microsoft.AspNetCore.Server.Kestrel[2]
Connection id "0HLH7PMRMMAFO" stopped.
dbug: Microsoft.AspNetCore.Server.Kestrel[2]
Connection id "0HLH7PMRMMAFR" stopped.
dbug: Microsoft.AspNetCore.Server.Kestrel[2]
Connection id "0HLH7PMRMMAFQ" stopped.
dbug: Microsoft.AspNetCore.Server.Kestrel[2]
Connection id "0HLH7PMRMMAFT" stopped.

在18:47:47,CPU最终回落。看起来问题出在Kestrel使连接保持活动状态两分钟,而这样做却使CPU耗尽。

我应该如何处理?我可以考虑不尊重响应中的Keep-Alive标头:Connection id "0HLH7PMRMMAFR" completed keep alive response.但Kestrel不应继续重用此连接而不是创建新连接吗?

我想我不能在本地重现此内容,因为它必须是AWS ALB插入一个保持活动的标头吗?

1 个答案:

答案 0 :(得分:1)

我想我找到了问题!

https://github.com/aspnet/KestrelHttpServer/issues/2694

更新为2.1.4,它消失了。请记住,如果您使用的是最新版本的框架,请务必检查是否有任何新的更新和错误修正:)