Question

Environment:

Ubuntu 14.04 + PostgreSQL 9.4.

Here is my setup ('->' denotes physical streaming replication, PSR):

Master1 -> Slave1 (primary) -> Slave2

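This works correctly: changes on Master1 are reflected in Slave1, and then in Slave2.

If I disable Master1 and promote Slave1 to master using trigger_file, Slave1 is promoted successfully and I can write to it.

However, replication between the newly promoted Slave1 and Slave2 stops.

Is this the expected behaviour? I expected replication to continue like this: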

Slave1 -> Slave2

so that writes to Slave1 are reflected in Slave2.

Update

Logs:

Slave1 promotion:

2017-10-03 16:43:20 BST  @ LOCATION:  libpqrcv_connect, libpqwalreceiver.c:107
2017-10-03 16:43:25 BST  @ FATAL:  XX000: could not connect to the primary server: could not connect to server: Connection refused
        Is the server running on host "192.168.20.55" and accepting
        TCP/IP connections on port 5432?

2017-10-03 16:43:25 BST  @ LOCATION:  libpqrcv_connect, libpqwalreceiver.c:107
2017-10-03 16:43:30 BST  @ LOG:  00000: trigger file found: /var/lib/postgresql/9.4/main/failover_trigger.5432
2017-10-03 16:43:30 BST  @ LOCATION:  CheckForStandbyTrigger, xlog.c:11440
2017-10-03 16:43:30 BST  @ LOG:  00000: redo done at 0/19000740
2017-10-03 16:43:30 BST  @ LOCATION:  StartupXLOG, xlog.c:7032
2017-10-03 16:43:30 BST  @ LOG:  00000: last completed transaction was at log time 2017-10-03 16:41:23.430752+01
2017-10-03 16:43:30 BST  @ LOCATION:  StartupXLOG, xlog.c:7037
2017-10-03 16:43:30 BST  @ LOG:  00000: selected new timeline ID: 2
2017-10-03 16:43:30 BST  @ LOCATION:  StartupXLOG, xlog.c:7153
2017-10-03 16:43:30 BST  @ LOG:  00000: archive recovery complete
2017-10-03 16:43:30 BST  @ LOCATION:  exitArchiveRecovery, xlog.c:5459
2017-10-03 16:43:30 BST  @ LOG:  00000: MultiXact member wraparound protections are now enabled
2017-10-03 16:43:30 BST  @ LOCATION:  DetermineSafeOldestOffset, multixact.c:2619
2017-10-03 16:43:30 BST  @ LOG:  00000: database system is ready to accept connections
2017-10-03 16:43:30 BST  @ LOCATION:  reaper, postmaster.c:2795
2017-10-03 16:43:30 BST  @ LOG:  00000: autovacuum launcher started
2017-10-03 16:43:30 BST  @ LOCATION:  AutoVacLauncherMain, autovacuum.c:431

Slave2:

2017-10-03 16:43:30 BST  @ LOG:  00000: replication terminated by primary server
2017-10-03 16:43:30 BST  @ DETAIL:  End of WAL reached on timeline 1 at 0/190007A8.
2017-10-03 16:43:30 BST  @ LOCATION:  WalReceiverMain, walreceiver.c:446
2017-10-03 16:43:30 BST  @ LOG:  00000: fetching timeline history file for timeline 2 from primary server
2017-10-03 16:43:30 BST  @ LOCATION:  WalRcvFetchTimeLineHistoryFiles, walreceiver.c:669
2017-10-03 16:43:30 BST  @ LOG:  00000: record with zero length at 0/190007A8
2017-10-03 16:43:30 BST  @ LOCATION:  ReadRecord, xlog.c:4184
2017-10-03 16:43:30 BST  @ LOG:  00000: restarted WAL streaming at 0/19000000 on timeline 1
2017-10-03 16:43:30 BST  @ LOCATION:  WalReceiverMain, walreceiver.c:374
2017-10-03 16:43:30 BST  @ LOG:  00000: replication terminated by primary server
2017-10-03 16:43:30 BST  @ DETAIL:  End of WAL reached on timeline 1 at 0/190007A8.

Slave1 IP:

192.168.20.56

Slave2 IP:

192.168.20.53

pg_hba.conf allows Slave2 to connect to Slave1 for replication:

Slave1 pg_hba.conf section:

host    replication     replication     192.168.20.53/32        trust

Slave1 recovery.done:

standby_mode = 'on'
primary_conninfo = 'user=replication host=192.168.20.55 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres'
trigger_file = '/var/lib/postgresql/9.4/main/failover_trigger.5432'

Slave2 recovery.conf:

standby_mode = 'on'
primary_conninfo = 'user=replication host=192.168.20.56 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres'

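Any help is much appreciated.

Update and solution

Thanks to @Vao Tsun's answer: setting recovery_target_timeline to 'latest' in Slave2's recovery.conf and restarting the Slave2 PostgreSQL server (a restart, not a reload) allowed replication to resume: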

standby_mode = 'on'
primary_conninfo = 'user=replication host=192.168.20.56 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres'
recovery_target_timeline = 'latest'
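
A quick way to confirm that a standby's recovery.conf carries this setting is to parse the file and look the key up. The following is a minimal sketch, not a full parser: it assumes every setting is written as name = 'value' on its own line, as in the files shown above, and the helper names parse_recovery_conf and follows_latest_timeline are hypothetical.

```python
import re

# Simplified assumption: each setting has the form  name = 'value'
# (a doubled '' inside the value stands for a literal quote).
SETTING_RE = re.compile(r"^\s*(\w+)\s*=\s*'((?:[^']|'')*)'")


def parse_recovery_conf(text):
    """Return a dict of the quoted settings found in recovery.conf text."""
    settings = {}
    for line in text.splitlines():
        match = SETTING_RE.match(line)
        if match:  # comment lines and blank lines simply do not match
            settings[match.group(1)] = match.group(2).replace("''", "'")
    return settings


def follows_latest_timeline(text):
    """True if recovery_target_timeline is set to 'latest'."""
    return parse_recovery_conf(text).get("recovery_target_timeline") == "latest"


# The two Slave2 configurations from this question.
CONNINFO = ("primary_conninfo = 'user=replication host=192.168.20.56 port=5432 "
            "sslmode=prefer sslcompression=1 krbsrvname=postgres'\n")
before = "standby_mode = 'on'\n" + CONNINFO
after = before + "recovery_target_timeline = 'latest'\n"

print(follows_latest_timeline(before))
print(follows_latest_timeline(after))
```

With the original Slave2 file the check fails, and with the amended file it passes. The file is only read at server start, which is why a restart rather than a reload was needed.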

Answer 1

In your slave1 log:

2017-10-03 16:43:30 BST  @ LOG:  00000: selected new timeline ID: 2

and on slave2:

2017-10-03 16:43:30 BST  @ DETAIL:  End of WAL reached on timeline 1 at 0/190007A8.

So slave2 did not switch to timeline 2 after the promotion.

As I said in the comments, you need recovery_target_timeline='latest' in slave2's recovery.conf.
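
The same comparison can be done mechanically when the logs are long. This is a rough sketch that only recognises the two message formats quoted in this thread; the function names last_timeline and standby_is_behind are made up for illustration.

```python
import re

# Patterns taken from the two log messages quoted in this thread.
PROMOTED_RE = re.compile(r"selected new timeline ID: (\d+)")
STANDBY_RE = re.compile(r"End of WAL reached on timeline (\d+)")


def last_timeline(pattern, log_text):
    """Return the timeline ID from the last matching log line, or None."""
    matches = pattern.findall(log_text)
    return int(matches[-1]) if matches else None


def standby_is_behind(promoted_log, standby_log):
    """True if the standby still reports an older timeline than the promoted server."""
    new_tli = last_timeline(PROMOTED_RE, promoted_log)
    standby_tli = last_timeline(STANDBY_RE, standby_log)
    if new_tli is None or standby_tli is None:
        return False  # not enough information in the logs to decide
    return standby_tli < new_tli


slave1_log = "2017-10-03 16:43:30 BST  @ LOG:  00000: selected new timeline ID: 2"
slave2_log = ("2017-10-03 16:43:30 BST  @ DETAIL:  "
              "End of WAL reached on timeline 1 at 0/190007A8.")

print(standby_is_behind(slave1_log, slave2_log))
```

For the lines above, slave1 reports timeline 2 while slave2 is still at the end of timeline 1, so the check flags slave2 as behind.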

https://www.postgresql.org/docs/current/static/recovery-target-settings.html

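recovery_target_timeline (string): Specifies recovering into a particular timeline. The default is to recover along the same timeline that was current when the base backup was taken. Setting this to latest recovers to the latest timeline found in the archive, which is useful in a standby server. Other than that you only need to set this parameter in complex re-recovery situations, where you need to return to a state that itself was reached after a point-in-time recovery. See Section 25.3.5 for discussion.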

PostgreSQL 9.4 cascading replication failover
