Muhammad Usama [Thu, 14 Jun 2018 08:59:53 +0000 (13:59 +0500)]
Fix for wrong backend roles on standby after the failover
Pgpool standby nodes were getting the "require_backend_sync" signal before the
active/master Pgpool had finished the failover. As a result the standby was
getting the wrong backend node statuses. The cause was a simple coding mistake
where the failover indication function was passed the wrong argument.
Problem reported by Bo Peng <pengbo@sraoss.co.jp>
Bo Peng [Thu, 14 Jun 2018 05:27:53 +0000 (14:27 +0900)]
Add new feature to enable specifying lists of SQL patterns that should not be load-balanced.
Even though we can currently do this by adding the
/*NO LOAD BALANCE*/ comment to queries, this requires
modifying application code, which is not always
possible.
This feature enables specifying lists of SQL patterns
that should not be load-balanced.
-------------
New parameter
-------------
black_query_pattern_list = ''
You can specify a semicolon-separated list of SQL patterns
that should be sent to the primary node only.
SQL statements that match patterns in this list are not load balanced.
Only Master Slave mode is supported.
You can use regular expressions to match SQL patterns;
^ and $ are automatically added to each pattern.
When using characters such as "'", ";" or "*" in SQL patterns,
you need to escape them using "\".
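The matching rule described above can be sketched roughly as follows. This is an illustration in Python, not Pgpool-II's implementation: the function names and the sample pattern list are hypothetical, and Python's regex dialect and case-sensitive matching are assumptions that may differ from what Pgpool-II actually uses.

```python
import re
from typing import List


def split_patterns(raw: str) -> List[str]:
    """Split a pattern list on semicolons that are not escaped with a backslash."""
    parts = re.split(r"(?<!\\);", raw)
    return [p for p in parts if p]


def sent_to_primary_only(sql: str, raw_list: str) -> bool:
    """Return True if sql matches any pattern once ^ and $ have been added."""
    for pattern in split_patterns(raw_list):
        # The anchors are added automatically, so a pattern must cover the
        # whole statement (hence the explicit .* in the samples below).
        if re.search("^" + pattern + "$", sql):
            return True
    return False


# Hypothetical value for black_query_pattern_list (two patterns).
PATTERNS = r"SELECT \* FROM orders.*;.*pg_sleep.*"
```

With this sample list, "SELECT * FROM orders WHERE id = 1" and "SELECT pg_sleep(1)" would be kept off the load balancer, while "SELECT 1" would still be load balanced.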
Tatsuo Ishii [Tue, 12 Jun 2018 12:53:04 +0000 (21:53 +0900)]
Add "last_status_change" column to "show pool_nodes" command.
The new column indicates the time when "status" or "role" has been
changed. See [pgpool-hackers: 2822] for the reasoning to add the
column.
Probably "last_status_change" should be added to pcp_node_info command
and pgpool_adm functions as well but they are not included in this
commit.
Tatsuo Ishii [Tue, 12 Jun 2018 06:51:00 +0000 (15:51 +0900)]
Revert "Fix 055.backend_all_down test failure."
This reverts commit
65d48c483889d5e1f91898c33b40bc34abc7f0f6.
Tatsuo Ishii [Tue, 12 Jun 2018 06:21:52 +0000 (15:21 +0900)]
Fix 055.backend_all_down test failure.
The test fails because a pgpool zombie child process remains. Actually
the failover process is properly performed, but when the shutdown
script is executed in background, output to stdout/stderr was blocked,
and this could cause the zombie process syndrome. The solution is
redirecting stdout/stderr to /dev/null when spawning the shutdown
script in background.
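The idea of discarding a background child's output can be illustrated as below. This is only a Python analogue of the shell redirection described above, not the test script itself; the helper name and the sample child command are hypothetical.

```python
import subprocess
import sys


def spawn_in_background(argv):
    """Start a child whose stdout/stderr go to /dev/null so it never blocks on output."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )


# Hypothetical child that produces a lot of output.
child = spawn_in_background([sys.executable, "-c", "print('x' * 100000)"])
# Waiting on the child reaps it, so it does not linger as a zombie.
exit_code = child.wait()
```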
Bo Peng [Mon, 11 Jun 2018 13:55:25 +0000 (22:55 +0900)]
Add release-notes 3.7.4 - 3.4.18.
Tatsuo Ishii [Thu, 31 May 2018 02:39:09 +0000 (11:39 +0900)]
Fix pgpool hung when query cache enabled in extended query mode.
If the replication delay is too large, load balancing is
disabled. Unfortunately the query cache module tried to access the backend
even though the target node had already been changed to primary because load
balancing was disabled. To fix this, the target backend is identified
by using pending message data, which reflects the fact that load
balancing is disabled.
Bug reported in [pgpool-general-jp: 1534].
To reproduce the bug, the following steps are required.
1) create a 2-node streaming replication setup (just running
pgpool_setup will do this),
2) set backend_weight0 to 0 to force the load balance node to node 1.
3) break recovery.conf on node 1 (for example, modify conninfo)
4) start the whole cluster.
5) run massive DDL. pgbench -i is convenient for this.
6) make sure that the replication delay is large enough by using "show
pool_nodes".
7) run the following data using pgproto. (You need to adjust lines
including "#" so that # is at the beginning of the line.)
# Test for disable_load_balance_on_write feature.
#
# Force load balance node to 1.
##backend_weight0 = 0
##backend_weight1 = 1
# Start a transaction
'P' "" "BEGIN" 0
'B' "" "" 0 0 0
'E' "" 0
# Issue SELECT.
'P' "" "SELECT 1"
'B' "" "" 0 0 0
'E' "" 0
# Issue COMMIT
'P' "" "COMMIT" 0
'B' "" "" 0 0 0
'E' "" 0
'S'
'Y'
# Issue same SELECT.
'P' "" "SELECT 1"
'B' "" "" 0 0 0
'E' "" 0
'S'
'Y'
'X'
8) the next "SELECT 1" will hang.
Bo Peng [Wed, 30 May 2018 05:33:03 +0000 (14:33 +0900)]
Fix document typo of PCP commands option "-U".
Tatsuo Ishii [Mon, 28 May 2018 05:19:04 +0000 (14:19 +0900)]
Fix comment typo.
Bo Peng [Thu, 24 May 2018 08:37:43 +0000 (17:37 +0900)]
Delete some debug code.
Tatsuo Ishii [Thu, 24 May 2018 04:31:04 +0000 (13:31 +0900)]
Fix pgpool main process segfault when PostgreSQL 9.5 is used.
pgpool_setup -n 3 (or greater) triggers the bug. While recovering node
2, the pgpool main process tried to retrieve version info from backend #2
even if it was not running. This caused the segfault because the connection
was not established yet. The reason why PostgreSQL 9.6 or later did not
suffer from the bug was that the loop was exited as soon as the
server version was found to be higher than 9.5. To fix this, a call to the
VALID_BACKEND macro was added.
Tatsuo Ishii [Thu, 24 May 2018 02:07:35 +0000 (11:07 +0900)]
Do not set writing tx flag with SET TRANSACTION READ ONLY.
In extended query mode, execute() sets the flag upon completion of
writing queries. However, the flag was set even when SET TRANSACTION
READ ONLY was issued. Fix this by using
pool_is_transaction_read_only(). This has already been done in the simple
query case.
Tatsuo Ishii [Wed, 23 May 2018 08:18:42 +0000 (17:18 +0900)]
Fix wrong parameter passed to failover script.
Since 3.7.2, one of the failover script parameters, namely the old primary
node, was not passed correctly. The PRIMARY_NODE_ID macro was used for the
parameter value. Unfortunately, since 3.7.2 it checks the node status
and gives value 0 if the node is in down status. The node status
could be down if the former primary node was going down. To fix this, use
the REAL_PRIMARY_NODE_ID macro, which returns the current primary node id
regardless of the node status.
Problem reported by Pierre Timmermans in [pgpool-general: 6092].
Bo Peng [Mon, 21 May 2018 14:32:35 +0000 (23:32 +0900)]
Fix regression test error.
Tatsuo Ishii [Fri, 18 May 2018 00:20:27 +0000 (09:20 +0900)]
Revert "Improve failover.sh of pgpool_setup to avoid test error."
This reverts commit
22cc8ef69a071a698437a45bbd8336922a639d6c.
Tatsuo Ishii [Thu, 17 May 2018 06:37:58 +0000 (15:37 +0900)]
Clarify that failover_require_consensus requires that health check is enabled.
Bo Peng [Thu, 17 May 2018 00:12:59 +0000 (09:12 +0900)]
Improve failover.sh of pgpool_setup to avoid test error.
Tatsuo Ishii [Tue, 15 May 2018 08:27:13 +0000 (17:27 +0900)]
Fix failure to handle replication slot environment variable.
This caused script errors.
Tatsuo Ishii [Thu, 10 May 2018 02:53:34 +0000 (11:53 +0900)]
Fix pgpool_setup when -r option is used.
It failed to recognize that pg_rewind succeeded and fell back to rsync
mode.
Tatsuo Ishii [Wed, 9 May 2018 08:39:06 +0000 (17:39 +0900)]
Add -r option to pgpool_setup to allow use of pg_rewind.
With this option, pgpool_setup creates basebackup.sh, which tries
pg_rewind first. If it fails, it falls back to rsync.
Tatsuo Ishii [Wed, 9 May 2018 02:22:32 +0000 (11:22 +0900)]
Update outdated pcp_proc_info manual.
In particular, fix the time stamp description (it was UNIX epoch, which was
changed a long time ago).
Tatsuo Ishii [Tue, 8 May 2018 06:51:04 +0000 (15:51 +0900)]
Fix test.sh in extended_query_test.
It did not respect $PGPOOL_SETUP variable, which points to
pgpool_setup in this source tree.
Also make error message normalization more complete so that the unable_bind
test does not fail.
Tatsuo Ishii [Mon, 7 May 2018 03:57:45 +0000 (12:57 +0900)]
Downgrade most of DEBUG1 messages to DEBUG5.
This significantly reduces the size of pgpool log when pgpool starts
with -d option (this is equivalent to setting client_min_messages to
debug1).
Per discussion [pgpool-hackers: 2794].
Tatsuo Ishii [Fri, 4 May 2018 22:20:14 +0000 (07:20 +0900)]
Fix pgpool_setup to make replication slots properly.
It only created pgpool_setup_slot1. It should create
pgpool_setup_slot0, pgpool_setup_slot1... pgpool_setup_slotN, where N
= number of nodes -1.
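The intended naming scheme can be written out as a one-line sketch. The function name is hypothetical and the snippet only illustrates the slot names, not how pgpool_setup creates them.

```python
from typing import List


def replication_slot_names(num_nodes: int) -> List[str]:
    """Return pgpool_setup_slot0 .. pgpool_setup_slotN, where N = num_nodes - 1."""
    return [f"pgpool_setup_slot{i}" for i in range(num_nodes)]
```

For a 3-node cluster this yields pgpool_setup_slot0, pgpool_setup_slot1 and pgpool_setup_slot2.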
Tatsuo Ishii [Fri, 4 May 2018 08:35:31 +0000 (17:35 +0900)]
Add missing health_check_timeout in pgpool_setup.
Per node health_check_timeout was missing; it should have been
there since the per node health check parameter support was added.
Tatsuo Ishii [Fri, 4 May 2018 03:00:48 +0000 (12:00 +0900)]
Update 1st/2nd stage online recovery doc.
Now 1st/2nd stage online recovery commands accept 5 parameters.
Tatsuo Ishii [Fri, 4 May 2018 02:35:26 +0000 (11:35 +0900)]
Enhance online recovery document.
Clarify that 2nd stage command is only required in native replication
mode.
Tatsuo Ishii [Thu, 26 Apr 2018 08:27:40 +0000 (17:27 +0900)]
Set max_replication_slots when -s option is specified.
Older versions of PostgreSQL set max_replication_slots to 0 by default, which
prevents PostgreSQL from starting up if replication slots are used.
So set max_replication_slots to "number of database clusters + 10".
This should be enough.
Tatsuo Ishii [Thu, 26 Apr 2018 06:59:59 +0000 (15:59 +0900)]
Fix variable substitution bug.
Bo Peng [Wed, 25 Apr 2018 08:04:16 +0000 (17:04 +0900)]
Revert "Install Pgpool-II extension by using latest source code."
This reverts commit
ef2832b4b61cb2f1b1e01a650a178f6eddd05e02.
Bo Peng [Wed, 25 Apr 2018 06:34:03 +0000 (15:34 +0900)]
Install Pgpool-II extension by using latest source code.
Tatsuo Ishii [Wed, 25 Apr 2018 01:50:01 +0000 (10:50 +0900)]
Add pgpool_recovery--1.2.sql.
Forgot in the previous commit.
Tatsuo Ishii [Wed, 25 Apr 2018 01:05:33 +0000 (10:05 +0900)]
Update copyright year.
Tatsuo Ishii [Wed, 25 Apr 2018 01:02:02 +0000 (10:02 +0900)]
Fix compile error.
Tatsuo Ishii [Mon, 23 Apr 2018 04:12:47 +0000 (13:12 +0900)]
Fully implement "disable_load_balance_on_write".
This feature allows specifying the behavior when a write query is issued.
Everything is done except the Japanese documents.
Tatsuo Ishii [Thu, 19 Apr 2018 05:22:20 +0000 (14:22 +0900)]
First cut of disable load balance feature.
If a write query is issued in an explicit transaction and
disable_load_balance is set to 'trans_transaction', then subsequent
read queries will be redirected to the primary node even if the
transaction is closed.
TODO:
- Implement 'always' case. In addition to 'trans_transaction', the
effect persists even outside of explicit transactions until the
session ends.
- Documentation
Tatsuo Ishii [Thu, 19 Apr 2018 05:17:37 +0000 (14:17 +0900)]
Fix bug when USE_REPLICATION_SLOT is not used.
Tatsuo Ishii [Wed, 18 Apr 2018 14:42:52 +0000 (23:42 +0900)]
Fix false primary node detection code.
The previous commit was wrong.
Tatsuo Ishii [Wed, 18 Apr 2018 12:45:07 +0000 (21:45 +0900)]
Enhance figures explaining detach_false_primary.
Tatsuo Ishii [Wed, 18 Apr 2018 06:56:00 +0000 (15:56 +0900)]
Add Japanese doc for detach_false_primary parameter.
Also add figures for the parameter.
Tatsuo Ishii [Wed, 18 Apr 2018 05:54:17 +0000 (14:54 +0900)]
Fix detach_false_primary bug.
Regard any socket file starting with '/' as a UNIX domain socket
directory.
Tatsuo Ishii [Wed, 18 Apr 2018 02:40:18 +0000 (11:40 +0900)]
Use replication slot if possible.
By setting USE_REPLICATION_SLOT environment variable, now pgpool_setup
in all tests uses replication slots. This reduces disk space under
src/test/regression from 6.3GB to 5.1GB (1.2GB savings).
Tatsuo Ishii [Wed, 18 Apr 2018 02:33:10 +0000 (11:33 +0900)]
Let pgpool_setup recognize an environment variable to turn on "-s" option.
For this purpose, a new environment variable "USE_REPLICATION_SLOT" is added.
Tatsuo Ishii [Wed, 18 Apr 2018 01:03:37 +0000 (10:03 +0900)]
Prevent pcp_recovery_node from recovering "unused" status node.
Previously it was possible to try to recover a node without configuration data,
which leads to a variety of problems. See discussion:
https://www.pgpool.net/pipermail/pgpool-general/2018-March/006021.html
for more details.
Also I fixed the pgpool_recovery function so that it quotes an empty
string argument with double quotes. Without this, the argument is
treated as if it does not exist, which was the source of the complaint
from the user.
Tatsuo Ishii [Tue, 17 Apr 2018 08:56:53 +0000 (17:56 +0900)]
Update version to 4.0 devel.
Bo Peng [Tue, 17 Apr 2018 08:33:52 +0000 (17:33 +0900)]
Add release notes 3.7.3 - 3.3.21.
Tatsuo Ishii [Tue, 17 Apr 2018 06:10:32 +0000 (15:10 +0900)]
Complete detach_false_primary feature.
In addition to the previous commit:
- Add new config variable detach_false_primary
- Allow to run test along with streaming replication delay checking
- English docs added (Japanese docs needed to be added later)
- Regression test (018.detach_primary) is added
- Sample configuration files are added
- Process reporting is added
Bo Peng [Tue, 17 Apr 2018 05:32:26 +0000 (14:32 +0900)]
Disable health check per node parameters by default.
Tatsuo Ishii [Mon, 16 Apr 2018 02:47:20 +0000 (11:47 +0900)]
Allow using more than 1 standby in pgpool_setup using replication slot.
For this purpose, add recovery target node argument to pgpool_recovery
extension. So extension version is incremented to 1.2.
Tatsuo Ishii [Sun, 15 Apr 2018 11:32:32 +0000 (20:32 +0900)]
Add support for replication slot to pgpool_setup.
This eliminates the problem when a standby is promoted. When a standby
is promoted, it changes the timeline in the PITR archive, which will stop
other standbys, if any, because of the shared archive directory.
Tatsuo Ishii [Thu, 12 Apr 2018 23:00:24 +0000 (08:00 +0900)]
Fix pcp_detach_node hung when -g option is specified.
"pcp_detach_node -g" had been broken since 3.7. The cause was a misuse
of degenerate_backend_set_ex(). Because of this, actual failover
request was not sent to the pgpool main process. As a result,
pcp_worker process waited vainly for a signal arriving from the
process.
Per bug 391. Problem reported by Tomoyuki Sato, fix by me.
Tatsuo Ishii [Mon, 9 Apr 2018 08:44:21 +0000 (17:44 +0900)]
First cut of primary server checking.
For now the following are implemented:
- Check all backend nodes starting from node 0.
- If primary nodes appear twice or more, the second one and any after it are
assumed invalid.
- Such an invalid node will be degenerated at the next convenient
time. Currently such timing is at the start up of Pgpool-II. This is
apparently insufficient and should be improved later.
TODO:
- Verify primary nodes using pg_stat_wal_receiver.
- More chances to verify node status. Maybe in the same timing as
streaming replication delay checking?
- Add new GUCs to control this feature.
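The checking rule listed above (scan from node 0, treat every primary after the first as invalid) can be sketched as follows. This is a simplified Python illustration with a hypothetical function name and input shape; it does not show how Pgpool-II queries the backends or degenerates nodes.

```python
from typing import List


def find_false_primaries(is_primary: List[bool]) -> List[int]:
    """Return ids of nodes that claim to be primary after the first primary found.

    is_primary[i] is True if backend node i reports itself as a primary.
    """
    seen_primary = False
    invalid = []
    for node_id, primary in enumerate(is_primary):  # starts from node 0
        if not primary:
            continue
        if seen_primary:
            invalid.append(node_id)  # second primary or later: assumed invalid
        else:
            seen_primary = True
    return invalid
```

For example, if nodes 0 and 2 both report themselves as primary, only node 2 is flagged.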
Tatsuo Ishii [Mon, 9 Apr 2018 05:45:37 +0000 (14:45 +0900)]
Add new regression test for the case where node 0 is down.
test case 1: node 0 is already down before pgpool starts.
test case 2: node 0 goes down after pgpool starts.
test case 3: node 0 goes down and DISALLOW_TO_FAILOVER flag is set after pgpool starts.
Tatsuo Ishii [Sun, 8 Apr 2018 10:18:36 +0000 (19:18 +0900)]
Make calls to to_regclass fully schema qualified.
This is always the recommended way.
Tatsuo Ishii [Thu, 5 Apr 2018 08:11:36 +0000 (17:11 +0900)]
Fix pgpool child process segfault when ALWAYS_MASTER is on.
If the following conditions are all met, the pgpool child segfaults:
1) Streaming replication mode.
2) fail_over_on_backend_error is off.
3) ALWAYS_MASTER flag is set to the master (writer) node.
4) pgpool_status file indicates that the node mentioned in #3 is in
down status.
What happens here is,
1) find_primary_node() returns node id 0 without checking the status
of node 0 since ALWAYS_MASTER is set. It's remembered as the
primary node id. The node id is stored in Req_info->primary_node_id.
2) The connection to backend 0 is not created since pgpool_status says
it's in down status.
3) upon starting of session, select_load_balancing_node () is called
and it tries to determine the database name from client's start up
packet.
4) Since MASTER_CONNECTION macro points to the PRIMARY_NODE,
MASTER_CONNECTION(ses->backend) is NULL and it results in a segfault.
The fix is to change the PRIMARY_NODE_ID macro so that it returns
REAL_MASTER_NODE_ID (that is, the youngest node id which is alive) if
the node id in Req_info->primary_node_id is in down status. This can
be checked using the VALID_BACKEND_RAW macro.
The VALID_BACKEND macro cannot be used here because it calls
pool_is_node_to_be_sent_in_current_query() inside. The problem is that when a
query is about to be processed, pool_is_query_in_progress() is already
set but pool_is_node_to_be_sent() could return false because the
where_to_send member in the query context may not be set yet
(that's the cause of the bug introduced in Pgpool-II 3.7.2).
So we have the "true" primary node id in Req_info->primary_node_id,
and "fake" primary node id returned by PRIMARY_NODE_ID macro.
See [pgpool-hackers: 2687] and [pgpool-general: 5881] Pgpool-3.7.1
segmentation fault for more details.
Since the ALWAYS_MASTER flag was introduced in 3.7, back patched to 3.7
only.
Per bug report from Philip Champon.
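The fallback described in this commit can be modeled in a few lines. This is a Python sketch of the idea only, not the C macros themselves: the function names, the list-of-booleans status representation and the -1 return value when no node is alive are all assumptions made for illustration.

```python
from typing import List


def real_master_node_id(node_up: List[bool]) -> int:
    """Youngest (lowest) node id that is alive; -1 if none (assumed convention)."""
    for node_id, up in enumerate(node_up):
        if up:
            return node_id
    return -1


def primary_node_id(stored_primary: int, node_up: List[bool]) -> int:
    """Return the stored primary id unless that node is down.

    stored_primary plays the role of Req_info->primary_node_id (the "true"
    primary); the return value plays the role of the "fake" primary id.
    """
    if 0 <= stored_primary < len(node_up) and node_up[stored_primary]:
        return stored_primary
    return real_master_node_id(node_up)
```

Note that when the stored primary is alive it is returned unchanged, even if it is not node 0; the fallback applies only when that node is down.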
Tatsuo Ishii [Wed, 4 Apr 2018 02:17:02 +0000 (11:17 +0900)]
Improve watchdog documents.
Tatsuo Ishii [Mon, 2 Apr 2018 22:38:18 +0000 (07:38 +0900)]
More typo fix.
Tatsuo Ishii [Sat, 31 Mar 2018 21:39:39 +0000 (06:39 +0900)]
Update config README file.
Tatsuo Ishii [Fri, 30 Mar 2018 08:35:01 +0000 (17:35 +0900)]
More description added.
Tatsuo Ishii [Fri, 30 Mar 2018 08:04:22 +0000 (17:04 +0900)]
Add a document for adding new config parameter.
Bo Peng [Tue, 20 Mar 2018 08:46:35 +0000 (17:46 +0900)]
Improve test script 003.failover.
Tatsuo Ishii [Wed, 14 Mar 2018 08:19:18 +0000 (17:19 +0900)]
Deal with "unable to bind D cannot get parse message "S1"" error.
Previously this just raised an exception and issued "DISCARD ALL" to only
node 0 if the load balance node was 0, because the query context wants to do
so. The problem is that when an exception is raised, the query context is no
longer active and Pgpool-II tries to read from node 1, which causes
a hang.
Add new test for this to the extended query test.
Bo Peng [Wed, 14 Mar 2018 05:01:52 +0000 (14:01 +0900)]
Fix some test errors.
Add "wait_for_pgpool_startup" to wait for Pgpool-II starting.
Tatsuo Ishii [Fri, 2 Mar 2018 05:22:35 +0000 (14:22 +0900)]
Mention that users can avoid failover using backend_flag even on PostgreSQL admin shutdown.
Tatsuo Ishii [Fri, 2 Mar 2018 05:22:18 +0000 (14:22 +0900)]
Change version to 3.8 devel.
Bo Peng [Wed, 28 Feb 2018 08:46:54 +0000 (17:46 +0900)]
Fix document typos.
Tatsuo Ishii [Tue, 27 Feb 2018 05:04:20 +0000 (14:04 +0900)]
Add new regression test for node 0 not being primary.
Tatsuo Ishii [Tue, 27 Feb 2018 04:29:38 +0000 (13:29 +0900)]
Fix failure in replication mode.
If .psqlrc exists, pgpool_setup for replication mode fails because psql
produces messages like "Pager usage is off." which in turn confuses
a command after a pipe. The fix is to add the -q option to psql.
Tatsuo Ishii [Tue, 27 Feb 2018 04:22:15 +0000 (13:22 +0900)]
Allow pgpool_switch_xlog to support PostgreSQL 10.
With PostgreSQL 10 or later, pgpool_switch_xlog used in the recovery second
stage failed due to function name changes in PostgreSQL 10.
Tatsuo Ishii [Mon, 26 Feb 2018 08:01:50 +0000 (17:01 +0900)]
Revert "Fix pgpool child process segfault when ALWAYS_MASTER is on."
This reverts commit
9022ff842fb5dbbe06e2f2f4cf38fadf47b592da.
With the commit, write queries are always sent to node 0 even if the
primary node is not 0 because PRIMARY_NODE_ID macro returns
REAL_MASTER_NODE_ID, which is usually 0. Thus write queries fail
with:
ERROR: cannot execute INSERT in a read-only transaction
Tatsuo Ishii [Mon, 19 Feb 2018 05:44:39 +0000 (14:44 +0900)]
Add description about temporary installation for the test.
Tatsuo Ishii [Mon, 19 Feb 2018 05:37:11 +0000 (14:37 +0900)]
Allow to test using temporary installation.
Most of the necessary stuff for this was taken from regress.sh.
Tatsuo Ishii [Mon, 19 Feb 2018 05:19:30 +0000 (14:19 +0900)]
Enhance extended query test.
Add extra_scripts directory to include extra scripts to be executed
after the main test script (tests) runs. Currently only scripts for
parse-before-bind.data and parse-before-bind-2.data exist, so that the tests
confirm the load balancing behavior.
Also some tests are fixed so that they can run individually: the
table for testing is dropped at the end of the test, so test results
are not affected by the existence of the table.
Tatsuo Ishii [Wed, 14 Feb 2018 09:05:13 +0000 (18:05 +0900)]
Fix pgpool_adm family functions examples.
Wrong function names were used.
Tatsuo Ishii [Wed, 14 Feb 2018 05:08:59 +0000 (14:08 +0900)]
Start 3.8 development.
- Set version string to "3.8devel".
- Set new code name "torokiboshi".
- Regenerate gram.[ch].
- Enable maintainer mode in configure.ac.
Bo Peng [Tue, 13 Feb 2018 05:06:22 +0000 (14:06 +0900)]
Add release-notes 3.7.2 - 3.3.20.
Bo Peng [Mon, 12 Feb 2018 14:43:07 +0000 (23:43 +0900)]
Merge branch 'master' of ssh://git.postgresql.org/pgpool2
Bo Peng [Mon, 12 Feb 2018 14:41:04 +0000 (23:41 +0900)]
Fix typos.
Bo Peng [Mon, 12 Feb 2018 14:29:13 +0000 (23:29 +0900)]
Fix mistakes in figures.
Tatsuo Ishii [Sat, 10 Feb 2018 11:13:16 +0000 (20:13 +0900)]
Allow to build with libressl.
Per [pgpool-hackers: 2714].
Patch by Sandino Araico Sanchez.
Tatsuo Ishii [Fri, 9 Feb 2018 04:18:50 +0000 (13:18 +0900)]
Fix writing transaction flag is accidentally set at commit or rollback.
We set the writing transaction flag if it's a write query while processing
an execute message. However, the flag was set even if it's a commit or
rollback. This is an oversight. The flag is reset when starting the next
transaction anyway, so it's actually harmless, but a bug is a bug.
Muhammad Usama [Thu, 1 Feb 2018 14:55:16 +0000 (19:55 +0500)]
Throw a warning message when failover consensus settings on watchdog nodes differ.
Bo Peng [Wed, 31 Jan 2018 02:39:37 +0000 (11:39 +0900)]
Fix document typo.
Tatsuo Ishii [Mon, 29 Jan 2018 10:13:21 +0000 (19:13 +0900)]
Fix bug with socket writing.
pool_write_flush() is responsible for writing to sockets when pgpool's
write buffer is full (this function was introduced in 3.6.6 etc). When the
network write buffer in the kernel is full, it retries, but it forgot
to update the internal buffer pointer. As a result, broken data was
written to the socket. This resulted in a variety of problems including
too large message length.
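The class of bug described here is easy to see in a generic retry loop. The following is a Python sketch, not pool_write_flush() itself: write_all and FakeSocket are hypothetical names, and a real implementation would wait for the socket to become writable instead of retrying immediately.

```python
from typing import Callable


def write_all(write: Callable[[bytes], int], data: bytes) -> int:
    """Write all of data, advancing the offset after every partial write.

    Returns the number of retries caused by a full buffer.
    """
    offset = 0
    retries = 0
    while offset < len(data):
        try:
            written = write(data[offset:])
        except BlockingIOError:
            # Buffer full: try again later without touching the offset.
            retries += 1
            continue
        # Forgetting this update would resend already written bytes.
        offset += written
    return retries


class FakeSocket:
    """Hypothetical non-blocking socket: takes at most 3 bytes per call
    and reports a full buffer on every second call."""

    def __init__(self) -> None:
        self.received = bytearray()
        self.calls = 0

    def write(self, chunk: bytes) -> int:
        self.calls += 1
        if self.calls % 2 == 0:
            raise BlockingIOError
        accepted = chunk[:3]
        self.received += accepted
        return len(accepted)
```

Passing FakeSocket().write to write_all delivers the payload intact despite the partial writes and retries; dropping the offset update would duplicate the leading bytes instead.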
Tatsuo Ishii [Mon, 29 Jan 2018 04:53:18 +0000 (13:53 +0900)]
Set TCP_NODELAY and non blocking to frontend socket.
TCP_NODELAY is employed by PostgreSQL, so we do the same.
The listen fd is set to non blocking. Make sure the accepted fd is set to non
blocking as well.
Tatsuo Ishii [Mon, 29 Jan 2018 04:04:58 +0000 (13:04 +0900)]
Fix pgpool child process segfault when ALWAYS_MASTER is on.
If the following conditions are all met, the pgpool child segfaults:
1) Streaming replication mode.
2) fail_over_on_backend_error is off.
3) ALWAYS_MASTER flag is set to the master (writer) node.
4) pgpool_status file indicates that the node mentioned in #3 is in
down status.
What happens here is,
1) find_primary_node() returns node id 0 without checking the status
of node 0 since ALWAYS_MASTER is set. It's remembered as the
primary node id. The node id is stored in Req_info->primary_node_id.
2) The connection to backend 0 is not created since pgpool_status says
it's in down status.
3) upon starting of session, select_load_balancing_node () is called
and it tries to determine the database name from client's start up
packet.
4) Since MASTER_CONNECTION macro points to the PRIMARY_NODE,
MASTER_CONNECTION(ses->backend) is NULL and it results in a segfault.
The fix I propose is, to change PRIMARY_NODE_ID macro so that it
returns REAL_MASTER_NODE_ID (that is the youngest node id which is
alive) if the node id in Req_info->primary_node_id is in down status.
So we have the "true" primary node id in Req_info->primary_node_id,
and "fake" primary node id returned by PRIMARY_NODE_ID macro.
See [pgpool-hackers: 2687] and [pgpool-general: 5881] Pgpool-3.7.1
segmentation fault for more details.
Since the ALWAYS_MASTER flag was introduced in 3.7, back patched to 3.7
only.
Per bug report from Philip Champon.
Tatsuo Ishii [Tue, 23 Jan 2018 23:01:22 +0000 (08:01 +0900)]
Fix segfault when %a is in log_line_prefix and debug message is on.
log_line_prefix() gets called to create a log line prefix string. If
"%a" is specified in "log_line_prefix" parameter, log_line_prefix()
calls MASTER_CONNECTION macro, which calls
pool_virtual_master_db_node_id(), which calls ereport(), which calls
log_line_prefix() if debug message is on. This leads to an infinite
recursion and a segfault. Fix is, calling MASTER_NODE_ID macro instead
of MASTER_CONNECTION macro.
Per bug 376.
Tatsuo Ishii [Tue, 23 Jan 2018 04:24:40 +0000 (13:24 +0900)]
Fix per node health check parameters.
Some of them were string types that should have been integer types.
Bo Peng [Fri, 19 Jan 2018 05:00:27 +0000 (14:00 +0900)]
Change systemd service file to use STOP_OPTS=" -m fast".
Bo Peng [Fri, 19 Jan 2018 04:58:44 +0000 (13:58 +0900)]
Change pgpool_setup to add restore_command in recovery.conf.
Tatsuo Ishii [Thu, 18 Jan 2018 13:14:38 +0000 (22:14 +0900)]
Fix queries hanging in parse_before_bind with extended protocol and replication + load-balancing.
In case the client sends a BIND message for a query
that has not yet been parsed by the executing node,
the PARSE will be executed before attempting to BIND
the parameters.
However, during the execution of the PARSE, the session
context is not set to in_progress, which leads to wrong
backend validity tests in read_kind_from_backend which
in turn makes the process wait on a backend which is not
going to send anything.
Fixes bug #377.
Problem analysis and fix by Ancoron Luciferis and me.
Tatsuo Ishii [Wed, 10 Jan 2018 08:29:58 +0000 (17:29 +0900)]
Fix comment typo.
Bo Peng [Mon, 8 Jan 2018 06:46:34 +0000 (15:46 +0900)]
Add 3.7.1 - 3.3.19 release-notes.
Bo Peng [Mon, 8 Jan 2018 06:13:11 +0000 (15:13 +0900)]
Improve Makefiles.
Patch provided by Tomoaki Sato.
Bo Peng [Sat, 6 Jan 2018 07:16:17 +0000 (16:16 +0900)]
Fix document typo.
Tatsuo Ishii [Fri, 22 Dec 2017 06:20:13 +0000 (15:20 +0900)]
Replace /bin/ed with /bin/sed.
This change requires fewer packages in order to install pgpool_setup,
because /bin/sed is included in most distributions' base packages,
while /bin/ed is not.
Bo Peng [Thu, 21 Dec 2017 05:42:39 +0000 (14:42 +0900)]
Change the pgpool.service and sysconfig files to output Pgpool-II log.
Remove "Type=forking" and add OPTS=" -n" to
run Pgpool-II in non-daemon mode, because we need to redirect logs.
Use the "journalctl" command to see the Pgpool-II systemd log.
Bo Peng [Thu, 21 Dec 2017 03:34:05 +0000 (12:34 +0900)]
Fix some document errors.
Tatsuo Ishii [Tue, 19 Dec 2017 08:33:31 +0000 (17:33 +0900)]
Add documentation for SGML document build.
Tatsuo Ishii [Tue, 19 Dec 2017 01:09:46 +0000 (10:09 +0900)]
Fix per node health check parameters ignored.
Per bug 371. Back patch to the 3.7 stable tree where the feature was
introduced.
Also pgpool_setup is modified to add appropriate per node health check
parameters to pgpool.conf. This is necessary because
pgpool.conf.sample sets health_check_user0 to 'nobody', which
immediately causes health check failure on DB node 0.