PostDock (Postgres + Docker) shuts down after a few minutes

Time: 2018-10-18 19:06:30

Tags: postgresql docker load-balancing pg pgpool

I am using PostDock to load-balance between a Postgres master and its slaves.

I pulled the latest version of the GitHub repository and ran docker-compose -f ./docker-compose/latest.yml up -d pgmaster pgslave1 pgslave2 pgslave3 pgslave4 pgpool backup

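At first the pgpool container is in the starting state, and pgmaster and the 4 slaves are running. After a few minutes, however, the master shuts down, and then the other containers shut down as well.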

Can you help me solve this problem? Thanks!

Pgpool container log:

>>> STARTING SSH (if required)...
>>> TUNING UP SSH CLIENT...
> STARTING SSH SERVER...
>>> TURNING PGPOOL...
>>> Opening access from all hosts by md5 in /usr/local/etc/pool_hba.conf
>>> Adding user pcp_user for PCP
>>> Creating a ~/.pcppass file for pcp_user
>>> Adding users for md5 auth
>>>>>> Adding user monkey_user
>>> Adding check user 'monkey_user' for md5 auth
>>> Adding user 'monkey_user' as check user
>>> Adding user 'monkey_user' as health-check user
>>> Adding backends
>>>>>> Waiting for backend 0 to start pgpool (WAIT_BACKEND_TIMEOUT=60)
2018/10/18 18:41:48 Waiting for host: tcp://pgmaster:5432
2018/10/18 18:41:53 Connected to tcp://pgmaster:5432
>>>>>> Adding backend 0
>>>>>> Waiting for backend 1 to start pgpool (WAIT_BACKEND_TIMEOUT=60)
2018/10/18 18:41:53 Waiting for host: tcp://pgslave1:5432
2018/10/18 18:42:53 Timeout after 1m0s waiting on dependencies to become available: [tcp://pgslave1:5432]
>>>>>> Will not add node 1 - it's unreachable!
>>>>>> Waiting for backend 3 to start pgpool (WAIT_BACKEND_TIMEOUT=60)
2018/10/18 18:42:53 Waiting for host: tcp://pgslave3:5432
2018/10/18 18:43:53 Timeout after 1m0s waiting on dependencies to become available: [tcp://pgslave3:5432]
>>>>>> Will not add node 3 - it's unreachable!
>>>>>> Waiting for backend 2 to start pgpool (WAIT_BACKEND_TIMEOUT=60)
2018/10/18 18:43:53 Waiting for host: tcp://pgslave2:5432
2018/10/18 18:44:53 Timeout after 1m0s waiting on dependencies to become available: [tcp://pgslave2:5432]
>>>>>> Will not add node 2 - it's unreachable!
>>> Checking if we have enough backends to start
>>>>>> Can not start pgpool with REQUIRE_MIN_BACKENDS=3, BACKENDS_COUNT=1

And the pgmaster container log:

>> Recovery is in progress:
2018-10-18 18:41:53.856 UTC [73] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2018-10-18 18:41:53.856 UTC [73] LOG:  listening on IPv6 address "::", port 5432
2018-10-18 18:41:54.059 UTC [73] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2018-10-18 18:41:55.279 UTC [94] LOG:  database system was interrupted; last known up at 2018-10-17 19:26:34 UTC
2018-10-18 18:41:55.824 UTC [95] LOG:  incomplete startup packet
2018-10-18 18:41:59.368 UTC [96] FATAL:  the database system is starting up
2018-10-18 18:41:59.937 UTC [97] FATAL:  the database system is starting up
2018-10-18 18:42:08.669 UTC [94] LOG:  database system was not properly shut down; automatic recovery in progress
2018-10-18 18:42:09.080 UTC [94] LOG:  redo starts at 0/1634170
2018-10-18 18:42:09.135 UTC [98] FATAL:  the database system is starting up
2018-10-18 18:42:09.136 UTC [99] FATAL:  the database system is starting up
2018-10-18 18:42:09.197 UTC [94] LOG:  invalid record length at 0/16348F0: wanted 24, got 0
2018-10-18 18:42:09.197 UTC [94] LOG:  redo done at 0/16348B8
2018-10-18 18:42:09.197 UTC [94] LOG:  last completed transaction was at log time 2018-10-17 19:26:34.359763+00
2018-10-18 18:42:09.406 UTC [100] FATAL:  the database system is starting up
2018-10-18 18:42:10.005 UTC [101] FATAL:  the database system is starting up
2018-10-18 18:42:10.374 UTC [102] FATAL:  the database system is starting up
2018-10-18 18:42:10.757 UTC [103] FATAL:  the database system is starting up
2018-10-18 18:42:10.860 UTC [104] FATAL:  the database system is starting up
2018-10-18 18:42:12.480 UTC [73] LOG:  database system is ready to accept connections
2018-10-18 18:42:19.573 UTC [111] FATAL:  database "replication_db" does not exist
2018-10-18 18:42:20.046 UTC [112] FATAL:  database "replication_db" does not exist
>>>>>> RECOVERY_WAL_ID is empty!
>>> Not in recovery state (anymore)
>>> Waiting for local postgres server start...
>>> Wait schema replication_db.public on pgmaster:5432(user: replication_user,password: *******), will try 9 times with delay 10 seconds (TIMEOUT=90)
2018-10-18 18:42:21.075 UTC [134] FATAL:  database "replication_db" does not exist
psql: FATAL:  database "replication_db" does not exist
2018-10-18 18:42:29.630 UTC [136] FATAL:  database "replication_db" does not exist
2018-10-18 18:42:30.089 UTC [137] FATAL:  database "replication_db" does not exist
>>>>>> Host pgmaster:5432 is not accessible (will try 9 times more)
2018-10-18 18:42:31.118 UTC [147] FATAL:  database "replication_db" does not exist
psql: FATAL:  database "replication_db" does not exist
2018-10-18 18:42:39.671 UTC [149] FATAL:  database "replication_db" does not exist
2018-10-18 18:42:40.132 UTC [150] FATAL:  database "replication_db" does not exist
>>>>>> Host pgmaster:5432 is not accessible (will try 8 times more)
2018-10-18 18:42:41.162 UTC [160] FATAL:  database "replication_db" does not exist
psql: FATAL:  database "replication_db" does not exist
2018-10-18 18:42:49.713 UTC [162] FATAL:  database "replication_db" does not exist
2018-10-18 18:42:50.171 UTC [163] FATAL:  database "replication_db" does not exist
>>>>>> Host pgmaster:5432 is not accessible (will try 7 times more)
2018-10-18 18:42:51.203 UTC [173] FATAL:  database "replication_db" does not exist
psql: FATAL:  database "replication_db" does not exist
2018-10-18 18:42:59.754 UTC [175] FATAL:  database "replication_db" does not exist
2018-10-18 18:43:00.214 UTC [176] FATAL:  database "replication_db" does not exist
>>>>>> Host pgmaster:5432 is not accessible (will try 6 times more)
2018-10-18 18:43:01.247 UTC [186] FATAL:  database "replication_db" does not exist
psql: FATAL:  database "replication_db" does not exist
2018-10-18 18:43:02.381 UTC [189] FATAL:  no pg_hba.conf entry for replication connection from host "172.22.0.6", user "replication_user", SSL off
2018-10-18 18:43:02.703 UTC [190] FATAL:  no pg_hba.conf entry for replication connection from host "172.22.0.6", user "replication_user", SSL off
2018-10-18 18:43:03.019 UTC [191] FATAL:  no pg_hba.conf entry for replication connection from host "172.22.0.6", user "replication_user", SSL off
2018-10-18 18:43:03.125 UTC [192] FATAL:  no pg_hba.conf entry for replication connection from host "172.22.0.6", user "replication_user", SSL off
2018-10-18 18:43:09.798 UTC [193] FATAL:  database "replication_db" does not exist
2018-10-18 18:43:10.255 UTC [194] FATAL:  database "replication_db" does not exist
>>>>>> Host pgmaster:5432 is not accessible (will try 5 times more)
2018-10-18 18:43:11.293 UTC [204] FATAL:  database "replication_db" does not exist
psql: FATAL:  database "replication_db" does not exist
2018-10-18 18:43:19.838 UTC [207] FATAL:  database "replication_db" does not exist
2018-10-18 18:43:20.298 UTC [208] FATAL:  database "replication_db" does not exist
>>>>>> Host pgmaster:5432 is not accessible (will try 4 times more)
2018-10-18 18:43:21.345 UTC [218] FATAL:  database "replication_db" does not exist
psql: FATAL:  database "replication_db" does not exist
2018-10-18 18:43:29.880 UTC [220] FATAL:  database "replication_db" does not exist
2018-10-18 18:43:30.340 UTC [221] FATAL:  database "replication_db" does not exist
>>>>>> Host pgmaster:5432 is not accessible (will try 3 times more)

1 answer:

Answer 0: (score: 0)

I found the solution. The environment variables need to be configured for each node (node name, REPLICATION_PRIMARY_HOST, BACKEND)!
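
For readers wondering what such per-node settings might look like, here is a minimal, hypothetical docker-compose fragment. It is a sketch, not the poster's actual file. The service names (pgmaster, pgslave1, pgpool) and the variables REPLICATION_PRIMARY_HOST and REQUIRE_MIN_BACKENDS appear in the question and its logs. The variable names NODE_NAME and BACKENDS, the node name values, and the format of the BACKENDS value are assumptions and should be checked against the latest.yml shipped with your PostDock version.

```yaml
# Fragment only: image/build, volumes, networks and the remaining
# PostDock variables are omitted on purpose.
services:
  pgmaster:
    environment:
      NODE_NAME: node1                    # assumed variable name and value
      REPLICATION_PRIMARY_HOST: pgmaster  # host the node treats as primary
  pgslave1:
    environment:
      NODE_NAME: node2                    # assumed; distinct value per node
      REPLICATION_PRIMARY_HOST: pgmaster  # slaves point at the master service
  pgpool:
    environment:
      # Assumed variable name and entry format; one entry per backend node.
      BACKENDS: "0:pgmaster:5432:1:/var/lib/postgresql/data:ALLOW_TO_FAILOVER,1:pgslave1::::"
      # The pgpool log in the question shows this check failing with a
      # required minimum of 3 while only 1 backend was reachable.
      REQUIRE_MIN_BACKENDS: 1
```

If pgpool still refuses to start, comparing the number of reachable entries in the backend list with REQUIRE_MIN_BACKENDS is a reasonable first check, since that is the condition reported on the last line of the pgpool log.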