I am currently running a Flask + SQLAlchemy + uWSGI + nginx stack as the API backend for my application. I wanted to see the maximum number of concurrent connections my server can handle, so I used ApacheBench (ab) to send varying numbers of concurrent requests to an endpoint on the server.
The endpoint takes a JSON request body, extracts certain values, runs a query, and returns a JSON response based on the query results.
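For reference, a minimal sketch of that kind of endpoint (the route, field names, and the stubbed query function are hypothetical; the real handler runs a SQLAlchemy query against Postgres):

```python
# Minimal sketch of the endpoint shape described above. A dict lookup stands
# in for the real SQLAlchemy query so the example is self-contained.
from flask import Flask, jsonify, request

app = Flask(__name__)

def run_query(user_id):
    # Stand-in for the real database query.
    fake_rows = {1: {"name": "alice"}, 2: {"name": "bob"}}
    return fake_rows.get(user_id)

@app.route("/lookup", methods=["POST"])
def lookup():
    payload = request.get_json(force=True)   # parse the JSON request body
    row = run_query(payload.get("user_id"))  # extract values, run the query
    if row is None:
        return jsonify({"error": "not found"}), 404
    return jsonify(row)                      # JSON response from the result
```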
I ran a baseline test of 1 concurrent request for 10 total requests and got an average response time of 60 ms.
Another test with 10 concurrent requests over 100 requests averaged 150 ms; 100 concurrent requests over 1000 averaged 1500 ms; and 500 concurrent requests came in at around 7000-9000 ms.
```
Concurrency Level:      500
Time taken for tests:   38.710 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      1310000 bytes
Total body sent:        1105000
HTML transferred:       110000 bytes
Requests per second:    129.17 [#/sec] (mean)
Time per request:       3870.986 [ms] (mean)
Time per request:       7.742 [ms] (mean, across all concurrent requests)
Transfer rate:          33.05 [Kbytes/sec] received
                        27.88 kb/s sent
                        60.93 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       24   63  185.1     25    3025
Processing:   161 3737 2778.7   3316   26719
Waiting:      157 3737 2778.7   3316   26719
Total:        187 3800 2789.7   3366   26744

Percentage of the requests served within a certain time (ms)
  50%   3366
  66%   4135
  75%   4862
  80%   5711
  90%   7449
  95%   9158
  98%  11794
  99%  13373
 100%  26744 (longest request)
```
The latency appears to increase linearly, which makes sense, but it increases TOO quickly. After a lot of tinkering and profiling, I found that the bottleneck was the queries.
At the start of the benchmark, queries would process and return quickly, under 10-50 ms, but latency climbed rapidly, and in some cases I was seeing 10,000-15,000 ms.
I couldn't figure out why the database was slowing down so much, especially since it is empty (barring test data).
I tried running the application WITHOUT a connection pool, and latency went down (from 7-9 s to 5-6 s). I did not think this was possible, because everything I had read suggested that a connection pool always makes things faster by avoiding the overhead of establishing a new connection on every request.
I also experimented with INCREASING the connection pool size (from the default of 5 to 50), and that reduced latency even further than the pool-less setup (from 5-6 s down to 3-4 s).
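Both configurations can be expressed through SQLAlchemy's `create_engine` pooling options. A sketch of the two setups tried above; an in-memory SQLite URL stands in here for the real `postgresql` DSN, and `max_overflow` is an assumption, not a value from the experiment:

```python
# Sketch of the two pooling setups: no pool (NullPool) vs. an enlarged
# QueuePool. The SQLite in-memory URL is a stand-in for the real Postgres DSN.
from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool, QueuePool

# "Without a connection pool": NullPool opens/closes a connection per use.
no_pool_engine = create_engine("sqlite://", poolclass=NullPool)

# Enlarged pool: default pool_size is 5; raised to 50 as in the experiment.
big_pool_engine = create_engine(
    "sqlite://",
    poolclass=QueuePool,
    pool_size=50,     # connections kept open in the pool
    max_overflow=10,  # extra temporary connections beyond pool_size
)

print(big_pool_engine.pool.size())  # 50
```

Note that each uWSGI worker process gets its own pool, so the effective connection count is per process, not global.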
```
Concurrency Level:      500
Time taken for tests:   4.842 seconds
Complete requests:      836
Failed requests:        0
Non-2xx responses:      679
Total transferred:      272673 bytes
Total body sent:        294593
HTML transferred:       126353 bytes
Requests per second:    172.67 [#/sec] (mean)
Time per request:       2895.662 [ms] (mean)
Time per request:       5.791 [ms] (mean, across all concurrent requests)
Transfer rate:          55.00 [Kbytes/sec] received
                        59.42 kb/s sent
                        114.42 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       24  170  273.1     93    1039
Processing:    25  665 1095.2    149    4753
Waiting:       25  665 1095.2    149    4753
Total:         51  835 1123.9    279    4786

Percentage of the requests served within a certain time (ms)
  50%    279
  66%    487
  75%   1050
  80%   1059
  90%   2935
  95%   3658
  98%   4176
  99%   4337
 100%   4786 (longest request)
```
The latency is still extremely high (3-4 seconds for an API seems unreasonable by any standard), and I am trying to figure out how to decrease it further. Is the answer just more connections?
Note: I am running 4 uWSGI processes with 100 threads each on a server with 4 GB of RAM and a quad-core processor. I am using the psycopg2-cffi adapter for the database connection, and the application runs on PyPy.
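For context, 4 processes × 100 threads means up to 400 concurrent request handlers, each process sharing one SQLAlchemy pool (default size 5). A hypothetical ini sketch of such a setup (the option names are standard uWSGI; the socket and module values are placeholders):

```ini
[uwsgi]
; 4 worker processes x 100 threads = up to 400 concurrent requests
processes = 4
threads = 100
enable-threads = true
; placeholder values for this sketch
socket = /tmp/app.sock
module = app:app
```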
Answer 0 (score: 1)
A linear increase is perfectly normal if the database has to process your queries sequentially. Essentially, all concurrent requests are started at the same time but complete one after another, so, assuming a pool with a single connection, 60 ms per request, and 10 concurrent requests, you would see requests taking 60 ms, 120 ms, 180 ms, 240 ms, 300 ms, 360 ms, 420 ms, 480 ms, 540 ms, 600 ms, 600 ms, ..., 600 ms, 540 ms, 480 ms, .... Given n total requests and m concurrent requests, we can compute the time the average request takes:

f(n, m) = 60ms * (((m + 1) * m / 2) * 2 + (n - 2m) * m) / n

f(100, 10) = 546ms
f(1000, 100) = 5406ms
f(1000, 500) = 15,030ms

These numbers are similar to what you are seeing.
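The estimate is easy to reproduce with a short script (a sketch; t = 60 ms and f(n, m) are exactly as defined in the formula above):

```python
# Mean request time (ms) for n total requests at concurrency m, assuming a
# fixed 60 ms query time and a database serving one query at a time.
def f(n, m, t=60.0):
    return t * (((m + 1) * m / 2) * 2 + (n - 2 * m) * m) / n

print(f(100, 10))    # 546.0
print(f(1000, 100))  # 5406.0
print(f(1000, 500))  # 15030.0
```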
Now for the big question: why is the database processing queries almost sequentially? I can think of several reasons:
How do you fix this? It is a complicated question, but there are some potential solutions: