Question

How do I diagnose PostgreSQL performance problems?

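I have a Django-based webapp that uses PostgreSQL as its database backend on Ubuntu 12. Under heavy load the database seems to just disappear, which makes the Django interface unreachable and produces errors such as: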

django.db.utils.DatabaseError: error with no message from the libpq

django.db.utils.DatabaseError: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.

Strangely, the logs in /var/log/postgresql show nothing unusual. The only thing the log /var/log/postgresql/postgresql-9.1-main.log contains is entries like:

2012-09-01 12:24:01 EDT LOG:  unexpected EOF on client connection

Running top shows that PostgreSQL does not appear to be consuming any CPU, even though service postgresql status reports that it is still running.

Running `service postgresql restart` fixes the problem temporarily, but it comes back as soon as the database is under significant load again.

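I have checked dmesg and syslog, but I don't see anything that would explain the errors. What other logs should I check? How can I determine what is wrong with my PostgreSQL server?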

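Edit: My max_connections is set to 100. I do perform a lot of manual transactions, though. Reading up on how Django's ORM behaves with PostgreSQL in manual mode, it looks like I may have to call connection.close() explicitly, which I am not doing.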

Answer 1

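I found that this was caused by Django's buggy Postgres backend in combination with multiprocessing. Essentially, Django was not closing its connections properly on its own, which led to odd behavior such as a large number of "idle in transaction" connections. I fixed it by adding connection.close() to the end of the functions I launch through multiprocessing, and before certain queries that were throwing this error.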
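A minimal sketch of that fix, assuming each worker function is wrapped so that the connection is closed when the function finishes. The decorator name `closes_connection` and the `_StandInConnection` class are hypothetical and exist only so the example runs without a database; in a Django project you would pass `django.db.connection` instead.

```python
import functools


def closes_connection(conn):
    """Return a decorator that calls conn.close() after the wrapped function.

    `conn` is any object with a close() method. In a Django project this
    would be `django.db.connection` (assumption: not imported here so the
    sketch runs standalone).
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            finally:
                # Runs on success and on error, so the worker does not
                # keep its connection open after it finishes.
                conn.close()
        return wrapper
    return decorator


class _StandInConnection:
    """Hypothetical stand-in that only counts close() calls."""

    def __init__(self):
        self.close_calls = 0

    def close(self):
        self.close_calls += 1


demo_conn = _StandInConnection()


@closes_connection(demo_conn)
def worker(n):
    # Placeholder for the ORM queries a multiprocessing worker would run.
    return n * 2


result = worker(21)
```

Putting the close() call in a `finally` block is a design choice of this sketch rather than something the answer specifies; it keeps a failing worker from leaving its connection open.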

Answer 2

2012-09-01 12:24:01 EDT LOG:  unexpected EOF on client connection

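This message shows up, so the problem is on the client side, possibly some exception from libpq? There may be a related issue: when clients hang without logging out properly, you end up with a lot of idle connections, and you will soon start getting other errors.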
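One way to check for such connections is to query the pg_stat_activity view on the server. The following is a minimal sketch, assuming PostgreSQL 9.1 as in the question, where a session sitting in an open transaction reports `<IDLE> in transaction` in the `current_query` column (from 9.2 on, a `state` column carries this information instead). The function name and the sample rows are hypothetical; the rows stand in for what a database cursor would return for the query.

```python
from collections import Counter

# Query for PostgreSQL 9.1; on 9.2+ use the `pid` and `state` columns instead.
ACTIVITY_SQL = "SELECT procpid, current_query FROM pg_stat_activity"


def summarize_activity(rows):
    """Count sessions by kind from (procpid, current_query) rows."""
    counts = Counter()
    for _procpid, current_query in rows:
        if current_query == "<IDLE> in transaction":
            counts["idle in transaction"] += 1
        elif current_query == "<IDLE>":
            counts["idle"] += 1
        else:
            counts["active"] += 1
    return counts


# Hypothetical sample rows, shaped like cursor.fetchall() output.
sample_rows = [
    (101, "<IDLE> in transaction"),
    (102, "<IDLE> in transaction"),
    (103, "<IDLE>"),
    (104, "SELECT 1"),
]
summary = summarize_activity(sample_rows)
```

If the "idle in transaction" count keeps growing toward max_connections, that points at clients that are not closing their connections rather than at the server itself.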

Answer 3

The pg_ctl program has a few options that may help (see man pg_ctl):

   -c
       Attempt to allow server crashes to produce core files, on platforms
       where this is possible, by lifting any soft resource limit placed
       on core files. This is useful in debugging or diagnosing problems
       by allowing a stack trace to be obtained from a failed server
       process.

   -l filename
       Append the server log output to filename. If the file does not
       exist, it is created. The umask is set to 077, so access to the log
       file is disallowed to other users by default.
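As a rough illustration of how the two pg_ctl options above combine, here is a sketch that builds a `pg_ctl start` command line. The helper name `build_pg_ctl_start` and both paths are placeholders chosen for this example, not values from the question; `-D` is pg_ctl's data directory option.

```python
import shlex


def build_pg_ctl_start(data_dir, log_file, core_files=True):
    """Build an argv list for `pg_ctl start` using the -l and -c options."""
    argv = ["pg_ctl", "start", "-D", data_dir, "-l", log_file]
    if core_files:
        # -c lifts the soft limit on core files so a crash can be inspected.
        argv.append("-c")
    return argv


# Placeholder paths; substitute the real data directory and log location.
argv = build_pg_ctl_start("/path/to/data", "/tmp/postgres-debug.log")
command_line = " ".join(shlex.quote(arg) for arg in argv)
```

The resulting list can be passed to `subprocess.run` as the postgres user, or `command_line` can be pasted into a shell.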

The postgres program also has some debugging options (see man postgres):

   -d debug-level
       Sets the debug level. The higher this value is set, the more
       debugging output is written to the server log. Values are from 1 to
       5. It is also possible to pass -d 0 for a specific session, which
       will prevent the server log level of the parent postgres process
       from being propagated to this session.

From the "Semi-internal Options" section:

   -n
       This option is for debugging problems that cause a server process
       to die abnormally. The ordinary strategy in this situation is to
       notify all other server processes that they must terminate and then
       reinitialize the shared memory and semaphores. This is because an
       errant server process could have corrupted some shared state before
       terminating. This option specifies that postgres will not
       reinitialize shared data structures. A knowledgeable system
       programmer can then use a debugger to examine shared memory and
       semaphore state.

   -T
       This option is for debugging problems that cause a server process
       to die abnormally. The ordinary strategy in this situation is to
       notify all other server processes that they must terminate and then
       reinitialize the shared memory and semaphores. This is because an
       errant server process could have corrupted some shared state before
       terminating. This option specifies that postgres will stop all
       other server processes by sending the signal SIGSTOP, but will not
       cause them to terminate. This permits system programmers to collect
       core dumps from all server processes by hand.
