I've been trying to get this working, but I can't figure out why it doesn't. I've read of many people downloading it and running it as-is, but pgpool never connects to the master or the slave. I took the docker files from paunin's example in issue 57 and changed the image to the current postdock/postgres.

My docker-compose file is below, and I start it with:

```
docker-compose -f .\basic.yml up -d
```
```
version: '2'

networks:
  cluster:
    driver: bridge

services:
  pgmaster:
    image: postdock/postgres
    environment:
      PARTNER_NODES: "pgmaster,pgslave1"
      NODE_ID: 1 # Integer number of node
      NODE_NAME: node1 # Node name
      CLUSTER_NODE_NETWORK_NAME: pgmaster
      POSTGRES_PASSWORD: monkey_pass
      POSTGRES_USER: monkey_user
      POSTGRES_DB: monkey_db
      CONFIGS: "listen_addresses:'*'"
    ports:
      - 5431:5432
    networks:
      cluster:
        aliases:
          - pgmaster
  pgslave1:
    image: postdock/postgres
    environment:
      PARTNER_NODES: "pgmaster,pgslave1"
      REPLICATION_PRIMARY_HOST: pgmaster
      NODE_ID: 2
      NODE_NAME: node2
      CLUSTER_NODE_NETWORK_NAME: pgslave1
    ports:
      - 5441:5432
    networks:
      cluster:
        aliases:
          - pgslave1
  pgpool:
    image: postdock/pgpool
    environment:
      PCP_USER: pcp_user
      PCP_PASSWORD: pcp_pass
      WAIT_BACKEND_TIMEOUT: 60
      CHECK_USER: monkey_user
      CHECK_PASSWORD: monkey_pass
      CHECK_PGCONNECT_TIMEOUT: 3
      DB_USERS: monkey_user:monkey_pass
      BACKENDS: "0:pgmaster:5432:1:/var/lib/postgresql/data:ALLOW_TO_FAILOVER,1:pgslave1::::"
      CONFIGS: "num_init_children:250,max_pool:4"
    ports:
      - 5432:5432
      - 9898:9898 # PCP
    networks:
      cluster:
        aliases:
          - pgpool
```
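As a side note, a quick way to rule out plain networking or authentication problems is to probe each backend from inside the pgpool container once the stack is up. This is just a diagnostic sketch, assuming `pg_isready` and `psql` are available in the postdock/pgpool image (the container names follow docker-compose's `basic_<service>_1` pattern, as seen in the logs below):

```
# Probe each backend over the cluster network from inside the pgpool container.
docker exec -it basic_pgpool_1 pg_isready -h pgmaster -p 5432
docker exec -it basic_pgpool_1 pg_isready -h pgslave1 -p 5432

# Attempt a real login with the check user to rule out authentication issues.
docker exec -it basic_pgpool_1 psql -h pgmaster -p 5432 -U monkey_user -d monkey_db -c 'SELECT 1;'
```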
Both the master and the replica databases appear to be running fine. I can see them both in pgAdmin, and I can create a table and see it appear in monkey_db. However, it never makes it over to the replica.
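One way to check the replication state directly is the master's `pg_stat_replication` view; this is standard PostgreSQL, not specific to this image:

```
# On the master: list connected streaming-replication clients.
# An empty result means no standby is attached as a WAL receiver.
docker exec -it basic_pgmaster_1 psql -U monkey_user -d monkey_db \
  -c 'SELECT client_addr, state, sync_state FROM pg_stat_replication;'
```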
Here are the logs from the master container:
```
PS C:\platform\docker\basic> docker logs basic_pgmaster_1
>>> Setting up STOP handlers...
>>> STARTING SSH (if required)...
No pre-populated ssh keys!
cp: cannot stat '/home/postgres/.ssh/keys/*': No such file or directory
>>> SSH is not enabled!
>>> STARTING POSTGRES...
>>> SETTING UP POLYMORPHIC VARIABLES (repmgr=3+postgres=9 | repmgr=4, postgres=10)...
>>> TUNING UP POSTGRES...
>>> Cleaning data folder which might have some garbage...
>>> Check all partner nodes for common upstream node...
>>>>>> Checking NODE=pgmaster...
psql: could not connect to server: Connection refused
Is the server running on host "pgmaster" (172.22.0.3) and accepting
TCP/IP connections on port 5432?
>>>>>> Skipping: failed to get master from the node!
>>>>>> Checking NODE=pgslave1...
psql: could not connect to server: Connection refused
Is the server running on host "pgslave1" (172.22.0.2) and accepting
TCP/IP connections on port 5432?
>>>>>> Skipping: failed to get master from the node!
>>> Auto-detected master name: ''
>>> Setting up repmgr...
>>> Setting up repmgr config file '/etc/repmgr.conf'...
>>> Setting up upstream node...
>>> Sending in background postgres start...
>>> Waiting for local postgres server recovery if any in progress:LAUNCH_RECOVERY_CHECK_INTERVAL=30
>>> Recovery is in progress:
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /var/lib/postgresql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
Success. You can now start the database server using:
pg_ctl -D /var/lib/postgresql/data -l logfile start
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
waiting for server to start....2018-09-20 06:03:29.170 UTC [85] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2018-09-20 06:03:29.197 UTC [86] LOG: database system was shut down at 2018-09-20 06:03:28 UTC
2018-09-20 06:03:29.202 UTC [85] LOG: database system is ready to accept connections
done
server started
CREATE DATABASE
CREATE ROLE
/docker-entrypoint.sh: running /docker-entrypoint-initdb.d/entrypoint.sh
>>> Configuring /var/lib/postgresql/data/postgresql.conf
>>>>>> Config file was replaced with standard one!
>>>>>> Adding config 'listen_addresses'=''*''
>>>>>> Adding config 'shared_preload_libraries'=''repmgr_funcs''
>>> Creating replication user 'replication_user'
CREATE ROLE
>>> Creating replication db 'replication_db'
waiting for server to shut down...2018-09-20 06:03:30.494 UTC [85] LOG: received fast shutdown request
.2018-09-20 06:03:30.514 UTC [85] LOG: aborting any active transactions
2018-09-20 06:03:30.517 UTC [85] LOG: worker process: logical replication launcher (PID 92) exited with exit code 1
2018-09-20 06:03:30.517 UTC [87] LOG: shutting down
2018-09-20 06:03:30.542 UTC [85] LOG: database system is shut down
done
server stopped
PostgreSQL init process complete; ready for start up.
2018-09-20 06:03:30.608 UTC [47] LOG: listening on IPv4 address "0.0.0.0", port 5432
2018-09-20 06:03:30.608 UTC [47] LOG: listening on IPv6 address "::", port 5432
2018-09-20 06:03:30.616 UTC [47] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2018-09-20 06:03:30.646 UTC [131] LOG: database system was shut down at 2018-09-20 06:03:30 UTC
2018-09-20 06:03:30.664 UTC [47] LOG: database system is ready to accept connections
>>>>>> RECOVERY_WAL_ID is empty!
>>> Not in recovery state (anymore)
>>> Waiting for local postgres server start...
>>> Wait schema replication_db.public on pgmaster:5432(user: replication_user,password: *******), will try 9 times with delay 10 seconds (TIMEOUT=90)
>>>>>> Schema replication_db.public exists on host pgmaster:5432!
>>> Registering node with role master
INFO: connecting to master database
INFO: master register: creating database objects inside the 'repmgr_pg_cluster' schema
INFO: retrieving node list for cluster 'pg_cluster'
[REPMGR EVENT] Node id: 1; Event type: master_register; Success [1|0]: 1; Time: 2018-09-20 06:03:56.560674+00; Details:
[REPMGR EVENT] will execute script '/usr/local/bin/cluster/repmgr/events/execs/master_register.sh' for the event
[REPMGR EVENT::master_register] Node id: 1; Event type: master_register; Success [1|0]: 1; Time: 2018-09-20 06:03:56.560674+00; Details:
[REPMGR EVENT::master_register] Locking master...
[REPMGR EVENT::master_register] Unlocking standby...
NOTICE: master node correctly registered for cluster 'pg_cluster' with id 1 (conninfo: user=replication_user password=replication_pass host=pgmaster dbname=replication_db port=5432 connect_timeout=2)
>>> Starting repmgr daemon...
[2018-09-20 06:03:56] [NOTICE] looking for configuration file in current directory
[2018-09-20 06:03:56] [NOTICE] looking for configuration file in /etc
[2018-09-20 06:03:56] [NOTICE] configuration file found at: /etc/repmgr.conf
[2018-09-20 06:03:56] [INFO] connecting to database 'user=replication_user password=replication_pass host=pgmaster dbname=replication_db port=5432 connect_timeout=2'
[2018-09-20 06:03:56] [INFO] connected to database, checking its state
[2018-09-20 06:03:56] [INFO] checking cluster configuration with schema 'repmgr_pg_cluster'
[2018-09-20 06:03:56] [INFO] checking node 1 in cluster 'pg_cluster'
[2018-09-20 06:03:56] [INFO] reloading configuration file
[2018-09-20 06:03:56] [INFO] configuration has not changed
[2018-09-20 06:03:56] [INFO] starting continuous master connection check
```
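repmgr can also report how it sees the cluster; assuming the `repmgr` binary is on the PATH inside the container (the entrypoint above already references `/etc/repmgr.conf`), something like this shows the registered topology:

```
# Ask repmgr for its view of the cluster from inside the master container.
docker exec -it basic_pgmaster_1 repmgr -f /etc/repmgr.conf cluster show
```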
Here are the logs from the slave. It looks like the master database was cloned successfully:
```
>>> Setting up STOP handlers...
>>> STARTING SSH (if required)...
No pre-populated ssh keys!
cp: cannot stat '/home/postgres/.ssh/keys/*': No such file or directory
>>> SSH is not enabled!
>>> STARTING POSTGRES...
>>> SETTING UP POLYMORPHIC VARIABLES (repmgr=3+postgres=9 | repmgr=4, postgres=10)...
>>> TUNING UP POSTGRES...
>>> Cleaning data folder which might have some garbage...
>>> Check all partner nodes for common upstream node...
>>>>>> Checking NODE=pgmaster...
psql: could not connect to server: Connection refused
Is the server running on host "pgmaster" (172.22.0.3) and accepting
TCP/IP connections on port 5432?
>>>>>> Skipping: failed to get master from the node!
>>>>>> Checking NODE=pgslave1...
psql: could not connect to server: Connection refused
Is the server running on host "pgslave1" (172.22.0.2) and accepting
TCP/IP connections on port 5432?
>>>>>> Skipping: failed to get master from the node!
>>> Auto-detected master name: ''
>>> Setting up repmgr...
>>> Setting up repmgr config file '/etc/repmgr.conf'...
>>> Setting up upstream node...
cat: /var/lib/postgresql/data/standby.lock: No such file or directory
>>> Previously Locked standby upstream node LOCKED_STANDBY=''
>>> Waiting for upstream postgres server...
>>> Wait schema replication_db.repmgr_pg_cluster on pgmaster:5432(user: replication_user,password: *******), will try 30 times with delay 10 seconds (TIMEOUT=300)
psql: could not connect to server: Connection refused
Is the server running on host "pgmaster" (172.22.0.3) and accepting
TCP/IP connections on port 5432?
>>>>>> Host pgmaster:5432 is not accessible (will try 30 times more)
>>>>>> Schema replication_db.repmgr_pg_cluster is still not accessible on host pgmaster:5432 (will try 29 times more)
>>>>>> Schema replication_db.repmgr_pg_cluster is still not accessible on host pgmaster:5432 (will try 28 times more)
>>>>>> Schema replication_db.repmgr_pg_cluster is still not accessible on host pgmaster:5432 (will try 27 times more)
>>>>>> Schema replication_db.repmgr_pg_cluster exists on host pgmaster:5432!
>>> REPLICATION_UPSTREAM_NODE_ID=1
>>> Sending in background postgres start...
>>> Waiting for upstream postgres server...
>>> Wait schema replication_db.repmgr_pg_cluster on pgmaster:5432(user: replication_user,password: *******), will try 30 times with delay 10 seconds (TIMEOUT=300)
>>>>>> Schema replication_db.repmgr_pg_cluster exists on host pgmaster:5432!
>>> Starting standby node...
>>> Instance hasn't been set up yet.
>>> Clonning primary node...
>>> Waiting for upstream postgres server...
>>> Wait schema replication_db.repmgr_pg_cluster on pgmaster:5432(user: replication_user,password: *******), will try 30 times with delay 10 seconds (TIMEOUT=300)
NOTICE: destination directory '/var/lib/postgresql/data' provided
INFO: connecting to upstream node
INFO: Successfully connected to upstream node. Current installation size is 37 MB
INFO: checking and correcting permissions on existing directory /var/lib/postgresql/data ...
>>>>>> Schema replication_db.repmgr_pg_cluster exists on host pgmaster:5432!
>>> Waiting for cloning on this node is over(if any in progress): CLEAN_UP_ON_FAIL=, INTERVAL=30
>>> Replicated: 4
NOTICE: starting backup (using pg_basebackup)...
INFO: executing: '/usr/lib/postgresql/10/bin/pg_basebackup -l "repmgr base backup" -D /var/lib/postgresql/data -h pgmaster -p 5432 -U replication_user -c fast -X stream -S repmgr_slot_2 '
NOTICE: standby clone (using pg_basebackup) complete
NOTICE: you can now start your PostgreSQL server
HINT: for example : pg_ctl -D /var/lib/postgresql/data start
HINT: After starting the server, you need to register this standby with "repmgr standby register"
[REPMGR EVENT] Node id: 2; Event type: standby_clone; Success [1|0]: 1; Time: 2018-09-20 06:04:08.427899+00; Details: Cloned from host 'pgmaster', port 5432; backup method: pg_basebackup; --force: Y
>>> Configuring /var/lib/postgresql/data/postgresql.conf
>>>>>> Will add configs to the exists file
>>>>>> Adding config 'shared_preload_libraries'=''repmgr_funcs''
>>> Starting postgres...
>>> Waiting for local postgres server recovery if any in progress:LAUNCH_RECOVERY_CHECK_INTERVAL=30
>>> Recovery is in progress:
2018-09-20 06:04:08.517 UTC [163] LOG: listening on IPv4 address "0.0.0.0", port 5432
2018-09-20 06:04:08.517 UTC [163] LOG: listening on IPv6 address "::", port 5432
2018-09-20 06:04:08.521 UTC [163] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2018-09-20 06:04:08.549 UTC [171] LOG: database system was interrupted; last known up at 2018-09-20 06:04:06 UTC
2018-09-20 06:04:09.894 UTC [171] LOG: entering standby mode
2018-09-20 06:04:09.903 UTC [171] LOG: redo starts at 0/2000028
2018-09-20 06:04:09.908 UTC [171] LOG: consistent recovery state reached at 0/20000F8
2018-09-20 06:04:09.908 UTC [163] LOG: database system is ready to accept read only connections
2018-09-20 06:04:09.916 UTC [175] LOG: started streaming WAL from primary at 0/3000000 on timeline 1
>>> Cloning is done
>>>>>> WAL id: 000000010000000000000003
>>>>>> WAL_RECEIVER_FLAG=1!
>>> Not in recovery state (anymore)
>>> Waiting for local postgres server start...
>>> Wait schema replication_db.public on pgslave1:5432(user: replication_user,password: *******), will try 9 times with delay 10 seconds (TIMEOUT=90)
>>>>>> Schema replication_db.public exists on host pgslave1:5432!
>>> Unregister the node if it was done before
DELETE 0
>>> Registering node with role standby
INFO: connecting to standby database
INFO: connecting to master database
INFO: retrieving node list for cluster 'pg_cluster'
INFO: registering the standby
[REPMGR EVENT] Node id: 2; Event type: standby_register; Success [1|0]: 1; Time: 2018-09-20 06:04:38.676889+00; Details:
INFO: standby registration complete
NOTICE: standby node correctly registered for cluster pg_cluster with id 2 (conninfo: user=replication_user password=replication_pass host=pgslave1 dbname=replication_db port=5432 connect_timeout=2)
Locking standby (NEW_UPSTREAM_NODE_ID=1)...
>>> Starting repmgr daemon...
[2018-09-20 06:04:38] [NOTICE] looking for configuration file in current directory
[2018-09-20 06:04:38] [NOTICE] looking for configuration file in /etc
[2018-09-20 06:04:38] [NOTICE] configuration file found at: /etc/repmgr.conf
[2018-09-20 06:04:38] [INFO] connecting to database 'user=replication_user password=replication_pass host=pgslave1 dbname=replication_db port=5432 connect_timeout=2'
[2018-09-20 06:04:38] [INFO] connected to database, checking its state
[2018-09-20 06:04:38] [INFO] connecting to master node of cluster 'pg_cluster'
[2018-09-20 06:04:38] [INFO] retrieving node list for cluster 'pg_cluster'
[2018-09-20 06:04:38] [INFO] checking role of cluster node '1'
[2018-09-20 06:04:38] [INFO] checking cluster configuration with schema 'repmgr_pg_cluster'
[2018-09-20 06:04:38] [INFO] checking node 2 in cluster 'pg_cluster'
[2018-09-20 06:04:38] [INFO] reloading configuration file
[2018-09-20 06:04:38] [INFO] configuration has not changed
[2018-09-20 06:04:38] [INFO] starting continuous standby node monitoring
```
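Since this log shows WAL streaming starting, the standby's own state can be double-checked with the standard PostgreSQL catalog views (again a diagnostic sketch, not part of the original setup):

```
# On the standby: confirm it is in recovery and has an active WAL receiver.
docker exec -it basic_pgslave1_1 psql -U replication_user -d replication_db \
  -c 'SELECT pg_is_in_recovery();' \
  -c 'SELECT status, received_lsn FROM pg_stat_wal_receiver;'
```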
Here are the pgpool logs:
```
>>> STARTING SSH (if required)...
cp: cannot stat '/home/postgres/.ssh/keys/*': No such file or directory
No pre-populated ssh keys!
>>> SSH is not enabled!
>>> TURNING PGPOOL...
>>> Opening access from all hosts by md5 in /usr/local/etc/pool_hba.conf
>>> Adding user pcp_user for PCP
>>> Creating a ~/.pcppass file for pcp_user
>>> Adding users for md5 auth
>>>>>> Adding user monkey_user
>>> Adding check user 'monkey_user' for md5 auth
>>> Adding user 'monkey_user' as check user
>>> Adding user 'monkey_user' as health-check user
>>> Adding backends
>>>>>> Waiting for backend 0 to start pgpool (WAIT_BACKEND_TIMEOUT=60)
2018/09/20 06:03:26 Waiting for host: tcp://pgmaster:5432
2018/09/20 06:04:26 Timeout after 1m0s waiting on dependencies to become available: [tcp://pgmaster:5432]
>>>>>> Will not add node 0 - it's unreachable!
>>>>>> Waiting for backend 1 to start pgpool (WAIT_BACKEND_TIMEOUT=60)
2018/09/20 06:04:26 Waiting for host: tcp://pgslave1:5432
2018/09/20 06:05:26 Timeout after 1m0s waiting on dependencies to become available: [tcp://pgslave1:5432]
>>>>>> Will not add node 1 - it's unreachable!
>>> Checking if we have enough backends to start
>>>>>> Will start pgpool REQUIRE_MIN_BACKENDS=0, BACKENDS_COUNT=0
>>> Configuring /usr/local/etc/pgpool.conf
>>>>>> Adding config 'num_init_children' with value '250'
>>>>>> Adding config 'max_pool' with value '4'
>>> STARTING PGPOOL...
2018-09-20 06:05:26: pid 62: LOG: Backend status file /var/log/postgresql/pgpool_status does not exist
2018-09-20 06:05:26: pid 62: LOG: Setting up socket for 0.0.0.0:5432
2018-09-20 06:05:26: pid 62: LOG: Setting up socket for :::5432
2018-09-20 06:05:26: pid 62: LOG: find_primary_node_repeatedly: waiting for finding a primary node
2018-09-20 06:05:26: pid 320: FATAL: pgpool is not accepting any new connections
2018-09-20 06:05:26: pid 320: DETAIL: all backend nodes are down, pgpool requires at least one valid node
2018-09-20 06:05:26: pid 320: HINT: repair the backend nodes and restart pgpool
2018-09-20 06:05:26: pid 62: LOG: child process with pid: 320 exits with status 256
2018-09-20 06:05:26: pid 62: LOG: fork a new child process with pid: 333
2018-09-20 06:06:26: pid 319: FATAL: pgpool is not accepting any new connections
2018-09-20 06:06:26: pid 319: DETAIL: all backend nodes are down, pgpool requires at least one valid node
2018-09-20 06:06:26: pid 319: HINT: repair the backend nodes and restart pgpool
2018-09-20 06:06:26: pid 62: LOG: child process with pid: 319 exits with status 256
2018-09-20 06:06:26: pid 62: LOG: fork a new child process with pid: 351
2018-09-20 06:07:26: pid 333: FATAL: pgpool is not accepting any new connections
2018-09-20 06:07:26: pid 333: DETAIL: all backend nodes are down, pgpool requires at least one valid node
2018-09-20 06:07:26: pid 333: HINT: repair the backend nodes and restart pgpool
2018-09-20 06:07:26: pid 62: LOG: child process with pid: 333 exits with status 256
2018-09-20 06:07:26: pid 62: LOG: fork a new child process with pid: 370
2018-09-20 06:08:26: pid 370: FATAL: pgpool is not accepting any new connections
2018-09-20 06:08:26: pid 370: DETAIL: all backend nodes are down, pgpool requires at least one valid node
2018-09-20 06:08:26: pid 370: HINT: repair the backend nodes and restart pgpool
2018-09-20 06:08:26: pid 62: LOG: child process with pid: 370 exits with status 256
2018-09-20 06:08:26: pid 62: LOG: fork a new child process with pid: 388
2018-09-20 06:09:27: pid 302: FATAL: pgpool is not accepting any new connections
2018-09-20 06:09:27: pid 302: DETAIL: all backend nodes are down, pgpool requires at least one valid node
2018-09-20 06:09:27: pid 302: HINT: repair the backend nodes and restart pgpool
2018-09-20 06:09:27: pid 62: LOG: child process with pid: 302 exits with status 256
2018-09-20 06:09:27: pid 62: LOG: fork a new child process with pid: 406
2018-09-20 06:10:27: pid 316: FATAL: pgpool is not accepting any new connections
2018-09-20 06:10:27: pid 316: DETAIL: all backend nodes are down, pgpool requires at least one valid node
2018-09-20 06:10:27: pid 316: HINT: repair the backend nodes and restart pgpool
2018-09-20 06:10:27: pid 62: LOG: child process with pid: 316 exits with status 256
2018-09-20 06:10:27: pid 62: LOG: fork a new child process with pid: 424
2018-09-20 06:11:27: pid 351: FATAL: pgpool is not accepting any new connections
2018-09-20 06:11:27: pid 351: DETAIL: all backend nodes are down, pgpool requires at least one valid node
2018-09-20 06:11:27: pid 351: HINT: repair the backend nodes and restart pgpool
2018-09-20 06:11:27: pid 62: LOG: child process with pid: 351 exits with status 256
2018-09-20 06:11:27: pid 62: LOG: fork a new child process with pid: 442
```
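For completeness, once pgpool does attach backends, their status and roles can be inspected through pgpool-II's standard `SHOW` commands; assuming `psql` is available in the pgpool container:

```
# Ask pgpool which backends it has attached, and whether each is up and
# acting as primary or standby.
docker exec -it basic_pgpool_1 psql -h localhost -p 5432 -U monkey_user -d monkey_db \
  -c 'SHOW pool_nodes;'
```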
I thought this was a WAL-shipping problem, but the logs show the clone succeeding and the standby registering. It looks like something on the pgpool side, but I can't see what I'm missing.

Any help would be greatly appreciated.

Thanks.
Answer 0 (score: 0)
From czarny94 on the GitHub issue page:

Try changing the "createdb" line of the /src/pgsql/bin/postgres/primary/entrypoint.sh file. Below is the diff between origin/master and mine after the changes:
```
diff --git a/src/pgsql/bin/postgres/primary/entrypoint.sh b/src/pgsql/bin/postgres/primary/entrypoint.sh
index b8451f5..030cbc7 100755
--- a/src/pgsql/bin/postgres/primary/entrypoint.sh
+++ b/src/pgsql/bin/postgres/primary/entrypoint.sh
@@ -3,11 +3,11 @@ set -e
 FORCE_RECONFIGURE=1 postgres_configure
 ...
 echo ">>> Creating replication db '$REPLICATION_DB'"
-createdb $REPLICATION_DB -O $REPLICATION_USER
+createdb -U "${POSTGRES_USER}" "${REPLICATION_DB}" -O "${REPLICATION_USER}"
 ...
-echo "host replication $REPLICATION_USER 0.0.0.0/0 md5" >> $PGDATA/pg_hba.conf
+echo "host replication $REPLICATION_USER 0.0.0.0/0 trust" >> $PGDATA/pg_hba.conf
```
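Note that this change only takes effect if the image is rebuilt from the PostDock sources rather than pulled from Docker Hub. A minimal sketch of that flow, assuming a local checkout; the Dockerfile path below is an illustrative assumption and should be taken from the repository's actual layout:

```
# Hypothetical rebuild flow: clone, apply the diff above, rebuild, retag,
# so the compose file's "image: postdock/postgres" resolves to the local build.
git clone https://github.com/paunin/PostDock.git
cd PostDock
# ... edit src/pgsql/bin/postgres/primary/entrypoint.sh as in the diff ...
docker build -t postdock/postgres -f src/pgsql/Dockerfile .   # path is an assumption
docker-compose -f basic.yml up -d --force-recreate
```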