There is a long-standing bug in native replication mode. As reported
on pgpool-general, the databases can go out of sync if the postgres
process of a slave DB is killed. This is due to an oversight in
read_packets_and_process().

In replication mode, if a slave server's postgres process is killed,
only the local backend status is set to down:

    *(my_backend_status[i]) = CON_DOWN;

So the next DDL/DML in the same session is issued only to the master
node (and to the other slaves, if there are multiple slave nodes). This
leads to serious data inconsistency, because in native replication
mode all DB nodes must receive DDL/DML at the same time.

The fix is to trigger failover in this case.
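
To illustrate why a session-local status change is not enough, here is a
minimal, self-contained sketch. It is a toy model, not Pgpool-II code:
the Cluster and Session structs and every function name below are
hypothetical and exist only for this illustration. It contrasts marking
a node down in one session with detaching it from the cluster.

```c
#include <stdbool.h>

#define NUM_NODES 3

/* Hypothetical cluster-wide state (not Pgpool-II's real structures). */
typedef struct {
    bool attached[NUM_NODES]; /* node is still part of the cluster */
    int  applied[NUM_NODES];  /* number of writes each node has received */
} Cluster;

/* Hypothetical per-session view of the backends. */
typedef struct {
    bool local_up[NUM_NODES];
} Session;

void init(Cluster *c, Session *s)
{
    for (int i = 0; i < NUM_NODES; i++) {
        c->attached[i] = true;
        c->applied[i] = 0;
        s->local_up[i] = true;
    }
}

/* Send one write to every node that is attached and up in this session. */
void replicate_write(Cluster *c, const Session *s)
{
    for (int i = 0; i < NUM_NODES; i++)
        if (c->attached[i] && s->local_up[i])
            c->applied[i]++;
}

/* Old behavior: only this session stops using the node. */
void mark_down_locally(Session *s, int node)
{
    s->local_up[node] = false;
}

/* Fixed behavior: the node is detached for everyone. */
void trigger_failover(Cluster *c, Session *s, int node)
{
    c->attached[node] = false;
    s->local_up[node] = false;
}

/* True if every attached node has received the same number of writes. */
bool attached_nodes_in_sync(const Cluster *c)
{
    int expected = -1;
    for (int i = 0; i < NUM_NODES; i++) {
        if (!c->attached[i])
            continue;
        if (expected < 0)
            expected = c->applied[i];
        else if (c->applied[i] != expected)
            return false;
    }
    return true;
}
```

In this model, a write issued after mark_down_locally() skips a node that
is still attached, so the attached nodes diverge. After
trigger_failover(), the skipped node is no longer part of the cluster and
the remaining nodes stay consistent.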
Discussions:
https://www.pgpool.net/pipermail/pgpool-general/2020-March/006954.html
https://www.pgpool.net/pipermail/pgpool-hackers/2020-March/003540.html
break;
}
+ /*
+ * In native replication mode, we need to trigger failover
+ * to avoid data inconsistency.
+ */
+ else if (REPLICATION)
+ {
+ was_error = 1;
+ if (!VALID_BACKEND(i))
+ break;
+ notice_backend_error(i, REQ_DETAIL_SWITCHOVER);
+ sleep(5);
+ }
+
/*
* Just set local status to down.
*/