Enhance the stability of detach_false_primary.
author Tatsuo Ishii <ishii@sraoss.co.jp>
Sun, 17 Mar 2024 01:11:04 +0000 (10:11 +0900)
committer Tatsuo Ishii <ishii@sraoss.co.jp>
Sun, 17 Mar 2024 01:11:04 +0000 (10:11 +0900)
commit 0b94cd9f0d6591e9e5d230f1d5b074297916f023
tree f735be8e6ec59d1efe109c12a2b3b18f8fd309d4
parent 0bed08065157aa5c19022d22ea291a1a3aab3521

Previously, enabling detach_false_primary could cause all backend nodes
to go down.

Suppose watchdog is enabled and there are three watchdog nodes: pgpool0,
pgpool1 and pgpool2. If pgpool0 and pgpool1 find that the primary
PostgreSQL has gone down due to network trouble between pgpool and
PostgreSQL, they promote a standby node. pgpool2 could then see two
primary nodes, because its backend status has not yet been synced with
pgpool0 and pgpool1, and perform detach_false_primary against the
standby that is being promoted.

To prevent this, detach_false_primary is now performed only by the
watchdog leader node. With this change, pgpool will not see a
half-baked backend status, and the issue described above will not
happen.
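
The leader-only rule can be illustrated with a minimal sketch. This is
not the code from pool_worker_child.c: the wd_role type and the
may_detach_false_primary() function are hypothetical names made up for
illustration, and the behavior when watchdog is disabled (a single
pgpool may always proceed) is an assumption.

```c
#include <stdbool.h>

/* Hypothetical watchdog roles; not pgpool-II's actual types. */
typedef enum
{
	WD_ROLE_LEADER,
	WD_ROLE_STANDBY
} wd_role;

/*
 * Hypothetical helper: decide whether this pgpool node may perform
 * detach_false_primary.
 *
 * - If the feature is disabled, never detach.
 * - If watchdog is disabled, there is only one pgpool, so it may
 *   proceed (assumption).
 * - If watchdog is enabled, only the leader may proceed, so a node with
 *   a not-yet-synced backend status never detaches a node that is
 *   being promoted.
 */
static bool
may_detach_false_primary(bool detach_enabled, bool use_watchdog, wd_role role)
{
	if (!detach_enabled)
		return false;
	if (!use_watchdog)
		return true;
	return role == WD_ROLE_LEADER;
}
```

In the scenario above, pgpool2 is not the leader, so under this rule it
skips the check instead of detaching the standby that is being
promoted.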

Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2024-February/004432.html
([pgpool-hackers: 4431] detach_false_primary could make all nodes go down)
doc.ja/src/sgml/failover.sgml
doc/src/sgml/failover.sgml
src/streaming_replication/pool_worker_child.c
src/test/regression/tests/081.detach_primary_all_down/test.sh [new file with mode: 0755]