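My application runs on a Kubernetes cluster with a backend (Django), a frontend, and other related pods. It also has several services, one of which is a PostgreSQL database. I have two instances of the application (dev and prod, in different namespaces), and the problem described below occurs in both.
Sometimes I run into trouble when performing operations that access the database (for example, trying to log in). In the Kubernetes backend pod logs, I see an error message that ends with: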
...
django.db.utils.OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
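It usually lasts 10-30 minutes and then goes away on its own. I don't know how to reproduce the problem or work around it, so all I can do is watch for it. In this post I will try to describe some observations.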
# Django database settings (settings.py)
import os

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': os.environ.get('POSTGRES_DBNAME'),
        'USER': os.environ.get('POSTGRES_NAME'),
        'HOST': os.environ.get('POSTGRES_HOST'),
        'PASSWORD': os.environ.get('POSTGRES_PASS'),
        'PORT': os.environ.get('POSTGRES_PORT'),
        # None means persistent connections that are never closed by age
        'CONN_MAX_AGE': None,
    }
}
While the problem is occurring, I observed the following:
psql -d POSTGRES_DBNAME -U POSTGRES_NAME -h SERVER_IP -p SERVER_PORT
In this case, I see the same django.db.utils.OperationalError.
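To separate "the port is unreachable" from "the server accepts the connection and then drops it", a plain TCP probe run from the backend pod might help. This is only a minimal sketch: the helper name `can_reach`, the 3-second timeout, and the fallback values `localhost` and `5432` are my own assumptions, and a successful TCP connect says nothing about whether PostgreSQL itself is healthy.

```python
import os
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures
        return False


if __name__ == "__main__":
    # Same environment variables as in the Django settings; defaults are assumptions
    host = os.environ.get("POSTGRES_HOST", "localhost")
    port = int(os.environ.get("POSTGRES_PORT", "5432"))
    print(f"{host}:{port} reachable: {can_reach(host, port)}")
```

If the probe succeeds while psql still fails, the problem is more likely above the TCP layer (the server process or something in between closing the session) than plain network reachability.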
2020-08-09 19:21:48.700 GMT [4847] FATAL: password authentication failed for user "postgres"
2020-08-09 19:21:48.700 GMT [4847] DETAIL: Password does not match for user "postgres".
Connection matched pg_hba.conf line 95: "host all all all md5"
2020-08-09 19:21:48.706 GMT [4848] FATAL: password authentication failed for user "postgres"
2020-08-09 19:21:48.706 GMT [4848] DETAIL: Password does not match for user "postgres".
Connection matched pg_hba.conf line 95: "host all all all md5"
2020-08-09 19:21:48.722 GMT [4849] FATAL: password authentication failed for user "postgres"
2020-08-09 19:21:48.722 GMT [4849] DETAIL: Password does not match for user "postgres".
Connection matched pg_hba.conf line 95: "host all all all md5"
2020-08-09 19:21:49.735 GMT [4850] FATAL: password authentication failed for user "postgres"
2020-08-09 19:21:49.735 GMT [4850] DETAIL: Password does not match for user "postgres".
Connection matched pg_hba.conf line 95: "host all all all md5"
2020-08-09 19:21:49.742 GMT [4851] FATAL: password authentication failed for user "admin"
2020-08-09 19:21:49.742 GMT [4851] DETAIL: Role "admin" does not exist.
Connection matched pg_hba.conf line 95: "host all all all md5"
2020-08-09 19:21:49.757 GMT [4852] FATAL: password authentication failed for user "admin"
2020-08-09 19:21:49.757 GMT [4852] DETAIL: Role "admin" does not exist.
Connection matched pg_hba.conf line 95: "host all all all md5"
The logs also contain many messages like these:
2020-08-12 07:15:40.215 GMT [15706] FATAL: unsupported frontend protocol 0.0: server supports 2.0 to 3.0
2020-08-12 07:15:40.477 GMT [15707] FATAL: unsupported frontend protocol 255.255: server supports 2.0 to 3.0
2020-08-12 07:15:40.751 GMT [15708] FATAL: no PostgreSQL user name specified in startup packet
2020-08-12 09:31:41.343 GMT [15957] LOG: could not receive data from client: Connection reset by peer
2020-08-12 10:09:48.876 GMT [16117] LOG: could not receive data from client: Connection reset by peer
2020-08-12 10:16:27.459 GMT [16233] LOG: could not receive data from client: Connection reset by peer
2020-08-12 11:19:02.012 GMT [16325] LOG: could not receive data from client: Connection reset by peer
2020-08-12 12:40:14.555 GMT [16441] LOG: could not receive data from client: Connection reset by peer
2020-08-12 13:37:29.711 GMT [16826] LOG: could not receive data from client: Connection reset by peer
2020-08-12 15:13:01.149 GMT [16856] LOG: could not receive data from client: Connection reset by peer
2020-08-12 18:00:02.185 GMT [17644] LOG: invalid length of startup packet
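To see which kinds of messages dominate, the log lines above could be tallied with a small script. This is a rough sketch: the regular expression assumes the `timestamp TZ [pid] LEVEL: message` layout shown in my logs, and the function name `tally` and the normalisation rules (quoted names become `"?"`, numbers become `N`) are my own choices.

```python
import re
from collections import Counter

# Matches lines such as:
# 2020-08-12 07:15:40.215 GMT [15706] FATAL: unsupported frontend protocol ...
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \S+ "
    r"\[(?P<pid>\d+)\] (?P<level>[A-Z]+):\s+(?P<msg>.*)$"
)


def tally(lines):
    """Count log messages grouped by (level, normalised message)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            # Continuation lines (e.g. "Connection matched pg_hba.conf ...") are skipped
            continue
        # Replace quoted names and numbers so similar messages share one key
        msg = re.sub(r'"[^"]*"', '"?"', m.group("msg"))
        msg = re.sub(r"\d+(\.\d+)*", "N", msg)
        counts[(m.group("level"), msg)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python tally_logs.py < postgresql.log
    for (level, msg), n in tally(sys.stdin).most_common():
        print(f"{n:6d}  {level}: {msg}")
```

Grouping this way makes it easier to compare what the server logged during an outage window with what it logs at other times.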
I would like to find the root cause, but I don't have many leads to follow. I would be glad to hear any suggestions.