Tatsuo Ishii [Fri, 4 Oct 2019 04:52:19 +0000 (13:52 +0900)]
Fix inappropriate ereport call in VALID_BACKEND.
The VALID_BACKEND macro (more precisely, pool_virtual_master_db_node_id)
emitted a message if pgpool was performing failover/failback:
ereport(WARNING,
(errmsg("failover/failback is in progress"),
errdetail("executing failover or failback on backend"),
errhint("In a moment you should be able to reconnect to the database")));
This could be called within signal handlers, so
POOL_SETMASK(&BlockSig)/POOL_SETMASK(&UnBlockSig) was called around it to block
interrupts, because ereport is not reentrant. However, it is possible
that callers have already called POOL_SETMASK, and this could result
in an unwanted signal unblock.
The fix is to remove ereport and POOL_SETMASK altogether. This results
in removing the message above, but we have no choice.
I found the problem while investigating the failure of regression test
055.backend_all_down, but of course the bug could have bitten
users in other places.
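The hazard described above can be sketched outside Pgpool-II with plain POSIX calls. This is only an illustration: report_unconditional(), report_restoring() and sigusr1_blocked() are hypothetical names, and restoring the saved mask is shown as a generic alternative, not as what this commit does (the commit simply removes the ereport call).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Fragile pattern: blocks SIGUSR1, then unblocks it unconditionally.
 * If the caller had already blocked SIGUSR1, that block is lost. */
static void report_unconditional(void)
{
	sigset_t set;

	sigemptyset(&set);
	sigaddset(&set, SIGUSR1);
	sigprocmask(SIG_BLOCK, &set, NULL);
	/* ... non-reentrant work would go here ... */
	sigprocmask(SIG_UNBLOCK, &set, NULL);
}

/* Nesting-safe variant: remember the previous mask and restore it. */
static void report_restoring(void)
{
	sigset_t set, old;

	sigemptyset(&set);
	sigaddset(&set, SIGUSR1);
	sigprocmask(SIG_BLOCK, &set, &old);
	/* ... non-reentrant work would go here ... */
	sigprocmask(SIG_SETMASK, &old, NULL);
}

/* Returns 1 if SIGUSR1 is currently blocked in this thread. */
static int sigusr1_blocked(void)
{
	sigset_t cur;

	sigprocmask(SIG_BLOCK, NULL, &cur);
	return sigismember(&cur, SIGUSR1);
}
```

If a caller blocks SIGUSR1 first, the signal stays blocked after report_restoring() but not after report_unconditional().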
Muhammad Usama [Thu, 3 Oct 2019 14:53:44 +0000 (19:53 +0500)]
Fix for Coverity warnings in watchdog and lifecheck
Tatsuo Ishii [Thu, 3 Oct 2019 12:33:09 +0000 (21:33 +0900)]
Fix signal unblock leak in failover.
When a failover event occurs, register_node_operation_request() gets
called to enqueue failover/failback requests. If the request queue is
full, this function returns false after unlocking the semaphore, but it
forgot to unblock the signal mask. This left all signals blocked,
including SIGTERM, which made pgpool fail to shut down.
Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2019-October/003449.html
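One way to avoid this class of leak is to give the function a single exit path that always restores the mask. The sketch below is not the Pgpool-II code: enqueue_request(), QUEUE_MAX and the queue array are made up for illustration, and the semaphore is only indicated by comments.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_MAX 4

static int queue[QUEUE_MAX];
static int queue_len;

/* Returns false when the queue is full. The signal mask is restored
 * on both the success path and the "queue full" path. */
static bool enqueue_request(int req)
{
	sigset_t block, old;
	bool ok = false;

	sigfillset(&block);
	sigprocmask(SIG_BLOCK, &block, &old);
	/* a real implementation would lock the semaphore here */
	if (queue_len < QUEUE_MAX)
	{
		queue[queue_len++] = req;
		ok = true;
	}
	/* ... and unlock it here */
	sigprocmask(SIG_SETMASK, &old, NULL);
	return ok;
}
```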
Muhammad Usama [Sat, 28 Sep 2019 20:15:11 +0000 (01:15 +0500)]
Fix for bug-545: Quorum lost and not recovered
The master watchdog node was not adding the lost standby node to its list of valid
standby nodes after it was rediscovered by the lifecheck. The fix is to ask the
node to rejoin the master node when it gets rediscovered by the lifecheck.
As part of this commit, I have also added the watchdog data version and Pgpool-II
version in the watchdog info packet to make the extensions in the watchdog
messages easier in the future.
Thanks to Guille (reporter of this bug) for providing lots of help in testing the fix.
Tatsuo Ishii [Tue, 6 Aug 2019 02:27:30 +0000 (11:27 +0900)]
Overhaul health check debug facility.
check_backend_down_request() in health_check.c is intended to simulate
a communication failure between the health check and a
PostgreSQL backend node. It is driven by a file containing lines such as:
1 down
where the first field is the node id (starting from 0), followed by a tab and
"down". When the health check process finds the file, it makes the health check
fail on node 1.
After the health check brings the node into down status,
check_backend_down_request() changes "down" to "already_down" to
prevent repeating the node failure.
However, the question is whether this is necessary at all. I think
check_backend_down_request() should keep on reporting the down status,
and it should be called inside establish_persistent_connection() to
prevent repeating node failure, because the failing situation can be
simulated better this way. For example, currently the health
check retry is not simulated, but the new way can do it.
Moreover, in the current watchdog implementation, bringing a node into
quarantine state requires detecting a node communication error *two*
times. Since check_backend_down_request() only allows raising
node down *once* (after which the down state is changed to the already_down
state), it is impossible to test the watchdog quarantine using
check_backend_down_request(). I changed check_backend_down_request()
so that it continues to raise the "down" event as long as the down request
file exists.
This commit enhances check_backend_down_request() as described above.
1) The caller of check_backend_down_request() is now
establish_persistent_connection(), rather than
do_health_check_child().
2) check_backend_down_request() does not change "down" to
"already_down" anymore. This means that the second argument of
check_backend_down_request() is not useful anymore. Probably I
should remove the argument later on.
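A minimal sketch of the file check described above, assuming the "node id, tab, status" line format from this message. down_requested() is a hypothetical helper, not the actual check_backend_down_request() code, and the file path is whatever the caller passes in.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the request file lists node_id with status "down",
 * 0 otherwise (including when the file does not exist).
 * The file is never modified, so repeated calls keep reporting "down". */
static int down_requested(const char *path, int node_id)
{
	FILE *fp = fopen(path, "r");
	char line[256];
	int found = 0;

	if (fp == NULL)
		return 0;
	while (fgets(line, sizeof(line), fp) != NULL)
	{
		int id;
		char status[32];

		/* each line: node id, whitespace (a tab), status word */
		if (sscanf(line, "%d\t%31s", &id, status) == 2 &&
			id == node_id && strcmp(status, "down") == 0)
		{
			found = 1;
			break;
		}
	}
	fclose(fp);
	return found;
}
```

Because the helper only reads the file, a test can trigger the failure as many times as it needs (for example twice, for quarantine) and stop it by deleting the file.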
Tatsuo Ishii [Wed, 25 Sep 2019 05:22:21 +0000 (14:22 +0900)]
Fix memory leak in replication mode.
Per coverity.
Tatsuo Ishii [Tue, 24 Sep 2019 23:49:48 +0000 (08:49 +0900)]
Fix memory leak while attempting to connect to backend.
If no backend is up and running, memory for the copy of the startup packet
is lost. This was introduced by commit cdb49d3b7. Per Coverity.
Tatsuo Ishii [Wed, 18 Sep 2019 01:51:28 +0000 (10:51 +0900)]
Fix uninitialized variable.
Per Coverity.
Tatsuo Ishii [Tue, 17 Sep 2019 23:07:27 +0000 (08:07 +0900)]
Fix compiler warnings.
Tatsuo Ishii [Tue, 17 Sep 2019 22:39:15 +0000 (07:39 +0900)]
Fix compiler warning.
Tatsuo Ishii [Mon, 16 Sep 2019 22:21:25 +0000 (07:21 +0900)]
Revert "Fix occasional query hang while processing DEALLOCATE."
This reverts commit 8925e047fb42ae758966efa8eade47f6f8fbc41c.
Tatsuo Ishii [Mon, 16 Sep 2019 00:24:08 +0000 (09:24 +0900)]
Fix occasional query hang while processing DEALLOCATE.
When DEALLOCATE tries to remove a named statement, it inherits
the where_to_send map of the named statement in
where_to_send_deallocate(). However, it forgot to copy the load balance
node id in the query context of the named statement. As a result, the
query was never sent to the backend: if the target node id is neither
query_context->load_balance_node_id nor the primary node id,
pool_virtual_master_db_node_id (called via MASTER_NODE_ID)
returns the primary node id, and pool_send_and_wait(MASTER_NODE_ID)
ignores the request because VALID_BACKEND returns false in this case
(MASTER_NODE_ID = primary node id is not in the where_to_send map). As
a result, the following check_error() waits for a response from the backend in
vain.
The fix is to let where_to_send_deallocate() copy the load balance node id from
the query context of the previous named statement.
Per bug 546.
Tatsuo Ishii [Sun, 15 Sep 2019 13:39:18 +0000 (22:39 +0900)]
Fix segfault in certain case.
The scenario is something like:
1) a named statement is created.
2) DEALLOCATE removes it.
3) an erroneous query is executed.
In #2, the "sent message" for the named statement is removed but
"uncompleted_message" is left. Then after #3, in ReadyForQuery()
uncompleted_message is added and removed. However, the storage for
uncompleted_message has already been freed in #2, which causes a
segfault.
The fix is to set uncompleted_message to NULL in SimpleQuery() if the
command is not PREPARE, so that ReadyForQuery() does not try to remove the
already removed message.
Per bug 546.
Here is a minimum test case.
'P' "_plan0x7f2d465db530" "SELECT 1" 0
'S'
'Y'
'Q' "DEALLOCATE _plan0x7f2d465db530"
'Y'
'Q' "CREATE INDEX users_auth_id_index ON non_existing_table ( auth_id )"
'Y'
'X'
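The underlying pattern (a second pointer that outlives the object it refers to) can be sketched as follows. The structs and functions are simplified stand-ins with hypothetical names, and the sketch clears the alias at removal time, whereas the actual fix clears it in SimpleQuery().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct message { char *name; };
struct session { struct message *uncompleted_message; };

static struct message *create_message(const char *name)
{
	struct message *m = malloc(sizeof(*m));

	if (m == NULL)
		return NULL;
	m->name = malloc(strlen(name) + 1);
	if (m->name == NULL)
	{
		free(m);
		return NULL;
	}
	strcpy(m->name, name);
	return m;
}

/* Remove a message; clear the session's alias first so it cannot dangle. */
static void deallocate_message(struct session *s, struct message *m)
{
	if (s->uncompleted_message == m)
		s->uncompleted_message = NULL;
	free(m->name);
	free(m);
}

/* Returns 1 if there was an uncompleted message to process, 0 otherwise. */
static int ready_for_query(struct session *s)
{
	if (s->uncompleted_message == NULL)
		return 0;
	s->uncompleted_message = NULL;
	return 1;
}
```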
Tatsuo Ishii [Thu, 12 Sep 2019 04:40:05 +0000 (13:40 +0900)]
Fix identical code used for different branches per Coverity.
Tatsuo Ishii [Thu, 12 Sep 2019 04:39:41 +0000 (13:39 +0900)]
Fix memory leak per Coverity.
Tatsuo Ishii [Tue, 10 Sep 2019 06:54:13 +0000 (15:54 +0900)]
Fix typo in fork_lifecheck_child().
Tatsuo Ishii [Tue, 10 Sep 2019 06:42:10 +0000 (15:42 +0900)]
Fix typo in fork_watchdog_child().
Tatsuo Ishii [Fri, 6 Sep 2019 07:31:07 +0000 (16:31 +0900)]
Fix memory leak.
Per Coverity.
Tatsuo Ishii [Fri, 6 Sep 2019 06:54:39 +0000 (15:54 +0900)]
Fix memory leak.
Per Coverity.
Tatsuo Ishii [Fri, 6 Sep 2019 06:24:09 +0000 (15:24 +0900)]
Fix uninitialized variable.
Probably harmless, but a bug is a bug...
Per Coverity.
Tatsuo Ishii [Tue, 3 Sep 2019 22:45:17 +0000 (07:45 +0900)]
Doc: mention that VIP will not be brought up if quorum does not exist.
Tatsuo Ishii [Sun, 1 Sep 2019 02:38:35 +0000 (11:38 +0900)]
Fix pgpool_setup to reflect the -p (baseport) option in the ORIGBASEPORT variable.
Otherwise, the shutdown script generated by pgpool_setup does not use
the proper port number for the netstat command.
Bo Peng [Mon, 26 Aug 2019 06:51:22 +0000 (15:51 +0900)]
Doc: Fix missing documents from previous commit.
Tatsuo Ishii [Sun, 25 Aug 2019 02:37:35 +0000 (11:37 +0900)]
Doc: fix indentation in scripts.
Auto indentation by commit 2cb0bd3f8f236aeacfba37cd4d604893561bad52
broke indentation of scripts in <programlisting> tag.
Tatsuo Ishii [Sun, 25 Aug 2019 01:15:37 +0000 (10:15 +0900)]
Doc: fix typo in "What is Pgpool-II?" section.
Author: Alejandro Roman
Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2019-August/003392.html
Bo Peng [Fri, 16 Aug 2019 02:51:36 +0000 (11:51 +0900)]
Add "-I" option to "arping_cmd" command default setting.
Tatsuo Ishii [Thu, 15 Aug 2019 22:35:25 +0000 (07:35 +0900)]
Doc: run auto indent using emacs.
Here is the emacs script F.Y.I.
;; must be run by emacs
(load "/home/t-ishii/.emacs.d/init.el")
(find-file (nth 0 command-line-args-left));
(indent-region (point-min) (point-max));
(save-buffer)
Bo Peng [Thu, 15 Aug 2019 07:58:21 +0000 (16:58 +0900)]
Doc: update doc version.
Bo Peng [Thu, 15 Aug 2019 05:16:50 +0000 (14:16 +0900)]
Test: rename test script.
Bo Peng [Thu, 15 Aug 2019 04:58:27 +0000 (13:58 +0900)]
Doc: fix incorrect link name.
Bo Peng [Thu, 15 Aug 2019 04:38:30 +0000 (13:38 +0900)]
Prepare 3.7.11.
Bo Peng [Thu, 15 Aug 2019 04:30:58 +0000 (13:30 +0900)]
Doc: update release-note.
Bo Peng [Thu, 15 Aug 2019 01:29:44 +0000 (10:29 +0900)]
Doc: add 3.7.11-3.4.25 release-note.
Tatsuo Ishii [Wed, 14 Aug 2019 00:14:36 +0000 (09:14 +0900)]
Fix memory leak.
Pointed out by Coverity.
Tatsuo Ishii [Thu, 8 Aug 2019 02:38:02 +0000 (11:38 +0900)]
Make waiting for TIME_WAIT in pgpool_setup optional.
Since commit 3b32bc4e583da700cc8df7c5777e90341655ad3b, the shutdownall
script generated by pgpool_setup waits for the Pgpool-II socket in
TIME_WAIT state to disappear. However, in most cases this takes a long
time and makes developers' testing work uncomfortable.
This commit makes the wait optional: unless the environment variable
"CHECK_TIME_WAIT" is set to a value other than "false", the script never
waits for the TIME_WAIT state.
Tatsuo Ishii [Fri, 9 Aug 2019 08:04:28 +0000 (17:04 +0900)]
Fix "unable to bind. cannot get parse message" error.
This was caused by a too-eager memory free in parse_before_bind. It
called the
pool_remove_sent_message/pool_create_sent_message/pool_add_sent_message
combo to replace the query context in the sent message. Unfortunately,
pool_remove_sent_message frees memory such as the statement name, which was
being passed in by the caller. As a result, the new sent message created by
pool_create_sent_message pointed to the freed statement name, which may
make a search by statement name fail, because the freed memory area might be
overwritten by a later memory allocation. The fix is, instead of calling
pool_remove_sent_message etc., to just replace the query context in the
sent message.
Per bug 531.
Muhammad Usama [Thu, 8 Aug 2019 14:03:40 +0000 (19:03 +0500)]
Fix for 0000483: online-recovery is blocked after a child process exits ...
The problem is if some child process exits abnormally during the second stage
of online recovery, then the connection counter that keeps the track of exiting
processes does not get decremented and Pgpool-II keeps waiting for the exit of
the already exited process. Eventually, the recovery fails after
client_idle_limit_in_recovery expires.
The fix for this issue is to set the connection counter to zero when
client_idle_limit_in_recovery is enabled and its value is less than
recovery_timeout, since all clients must have been kicked out by the time
client_idle_limit_in_recovery expires.
A similar fix was already committed as part of bug 431 by Tatsuo Ishii, so this
commit basically imports the same logic into the watchdog function that processes
the remote online recovery requests.
Apart from the above-mentioned change, Hoshiai San identified that the watchdog
IPC command timeout for the online recovery start function executed through watchdog
is set to exactly the same value as recovery_timeout, which needs to be increased to
make the solution work correctly.
Muhammad Usama [Wed, 15 May 2019 21:36:35 +0000 (02:36 +0500)]
Fix for [pgpool-hackers: 3295] duplicate failover request ...
Pgpool should keep the backend health check running on quarantined nodes so
that when connectivity resumes, they automatically get removed
from quarantine. Otherwise a temporary network glitch could send the node
into a permanent quarantine state.
Muhammad Usama [Wed, 7 Aug 2019 15:22:01 +0000 (20:22 +0500)]
Fix for no primary on standby pgpool when primary is quarantined on master
Master watchdog Pgpool sends primary_node_id = -1 in the backend status sync
message if the primary node is quarantined on it. So standby watchdog Pgpool
must not update its primary_node_id if the primary backend node id in sync
message is invalid_node_id (-1) while the same sync message reports the
backend status of the current primary node as "NOT DOWN".
The issue was reported by "Tatsuo Ishii <ishii@sraoss.co.jp>" and fixed by me
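The rule above amounts to a small decision function. The sketch below uses hypothetical names (primary_after_sync(), enum node_status) and a simplified two-state status; it is not the watchdog code itself.

```c
#include <assert.h>

#define INVALID_NODE_ID (-1)

enum node_status { NODE_UP, NODE_DOWN };

/* Decide which primary node id a standby should keep after a sync message.
 * If the master reports no primary (-1) but the same message says our
 * current primary is not down, the primary is presumably quarantined on
 * the master, so the standby keeps its current value. */
static int primary_after_sync(int current_primary, int synced_primary,
							  const enum node_status *synced_status,
							  int num_nodes)
{
	if (synced_primary == INVALID_NODE_ID &&
		current_primary >= 0 && current_primary < num_nodes &&
		synced_status[current_primary] != NODE_DOWN)
		return current_primary;
	return synced_primary;
}
```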
Tatsuo Ishii [Thu, 8 Aug 2019 02:02:50 +0000 (11:02 +0900)]
Import some of memory manager debug facilities from PostgreSQL.
Now we can use CLOBBER_FREED_MEMORY, which is useful to detect
accesses to already pfreed memory.
Takuma Hoshiai [Mon, 29 Jul 2019 06:08:53 +0000 (15:08 +0900)]
Fix watchdog_setup command option
The mode option was handled incorrectly: when the pgpool_setup command was
called by the watchdog_setup command, the mode option was not set.
Tatsuo Ishii [Sun, 28 Jul 2019 02:11:07 +0000 (11:11 +0900)]
Fix pgpool_setup to produce correct follow master command.
The produced script incorrectly checked whether PostgreSQL was running
or not, and as a result it mistakenly concluded that PostgreSQL is
always running.
Bo Peng [Thu, 25 Jul 2019 00:19:53 +0000 (09:19 +0900)]
Fix regression test errors.
Bo Peng [Wed, 24 Jul 2019 12:19:26 +0000 (21:19 +0900)]
Use pg_get_expr() instead of pg_attrdef.adsrc to support PostgreSQL 12.
Since PostgreSQL 12 removed pg_attrdef.adsrc, use pg_get_expr() instead of pg_attrdef.adsrc if the backend version is 7.3 or later.
Thanks to Takuma Hoshiai for creating the patch.
Tatsuo Ishii [Wed, 17 Jul 2019 07:51:31 +0000 (16:51 +0900)]
Fix failover() so that it does not access an array out of bounds.
Per Coverity.
Tatsuo Ishii [Wed, 17 Jul 2019 07:48:37 +0000 (16:48 +0900)]
Enhance shutdown script of pgpool_setup.
I observe occasional regression test failure caused by bind error to
the TCP/IP port. This fix tries to confirm usage of the TCP/IP port
while executing shutdown script using netstat command.
Tatsuo Ishii [Tue, 16 Jul 2019 06:21:10 +0000 (15:21 +0900)]
Backport Pgversion().
Tatsuo Ishii [Sun, 7 Jul 2019 13:58:35 +0000 (22:58 +0900)]
Fix possible out of array index access.
It was pointed out by Coverity that node_id could be -1.
Tatsuo Ishii [Sun, 7 Jul 2019 01:09:25 +0000 (10:09 +0900)]
Fix query cache module so that it checks oid array's bound.
Tatsuo Ishii [Sat, 6 Jul 2019 23:08:25 +0000 (08:08 +0900)]
Fix off-by-one error in query cache module.
When debug print is enabled, it might have tried to access the oid array
out of bounds.
Tatsuo Ishii [Fri, 5 Jul 2019 05:32:43 +0000 (14:32 +0900)]
Allow health check process to reload pgpool.conf.
When the separate health check process was introduced, we forgot to send
a signal to the health check process when a pgpool.conf reload is
requested.
Tatsuo Ishii [Wed, 3 Jul 2019 03:59:11 +0000 (12:59 +0900)]
Bug525: Fix segfault when query cache is enabled.
When query cache is enabled, the
session_context->query_context->skip_cache_commit flag was set or
reset while processing the execute message. The problem was that this was done
before session_context->query_context was set. So the fix is to just
set/reset query_context->skip_cache_commit. session_context->query_context is
set later on anyway.
Per bug 525.
Tatsuo Ishii [Tue, 2 Jul 2019 09:40:11 +0000 (18:40 +0900)]
Make shutdownall to wait for completion of shutdown of Pgpool-II.
It was observed that the regression test occasionally failed because
the previous test had not completely finished before the next test started. To fix
the problem, make the shutdownall script generated by pgpool_setup wait
for completion of the shutdown of Pgpool-II.
Tatsuo Ishii [Tue, 2 Jul 2019 00:08:57 +0000 (09:08 +0900)]
Downgrade LOG to DEBUG5 in sent message module.
The log was added in commit 56a6b6a72, but in some cases it is
disturbing users.
Discussion: [pgpool-general: 6620] Fwd: A lot of "checking zapping sent message" in log
Tatsuo Ishii [Mon, 24 Jun 2019 13:13:18 +0000 (22:13 +0900)]
Fix mistake introduced in the previous commit.
Tatsuo Ishii [Mon, 24 Jun 2019 01:57:34 +0000 (10:57 +0900)]
Fix segfault when "samenet" is specified in pool_hba.conf.
When "samenet" is specified, SockAddr_cidr_mask(struct
sockaddr_storage *mask, char *numbits, int family) gets called with
numbits == NULL. However the function was not prepared for
it. Originally the function was imported from PostgreSQL. When the bug
was fixed in PostgreSQL, unfortunately the fix was not applied to
Pgpool-II. This commit applies the same fix as PostgreSQL.
Discussion: [pgpool-general: 6601] Pgpool-II + hba + samenet = segfault in libc-2.24.so
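A simplified, IPv4-only sketch of the NULL handling is shown below. cidr_mask_v4() is a hypothetical helper rather than the real SockAddr_cidr_mask(), and it assumes that a NULL numbits should produce a full-length mask, which is my understanding of the upstream PostgreSQL behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Store an IPv4 netmask (host byte order) for the given prefix length.
 * A NULL numbits means "full-length mask" (assumption, see lead-in).
 * Returns 0 on success, -1 if numbits is empty, non-numeric or out of range. */
static int cidr_mask_v4(uint32_t *mask, const char *numbits)
{
	long bits;

	if (numbits == NULL)
		bits = 32;
	else
	{
		char *end;

		bits = strtol(numbits, &end, 10);
		if (*numbits == '\0' || *end != '\0')
			return -1;
	}
	if (bits < 0 || bits > 32)
		return -1;
	/* avoid shifting by the full width when bits == 0 */
	*mask = (bits == 0) ? 0 : (uint32_t) (0xffffffffUL << (32 - bits));
	return 0;
}
```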
Bo Peng [Thu, 20 Jun 2019 03:53:40 +0000 (12:53 +0900)]
doc: Fix documentation typos.
Bo Peng [Wed, 19 Jun 2019 02:04:02 +0000 (11:04 +0900)]
doc: Fix documentation typo.
Bo Peng [Wed, 19 Jun 2019 01:51:46 +0000 (10:51 +0900)]
doc: Fix documentation errors in follow_sh script.
Tatsuo Ishii [Tue, 11 Jun 2019 05:10:47 +0000 (14:10 +0900)]
Merge branch 'V3_7_STABLE' of ssh://git.postgresql.org/pgpool2 into V3_7_STABLE
Tatsuo Ishii [Tue, 11 Jun 2019 04:47:42 +0000 (13:47 +0900)]
Fix health check process is not shutting down in certain cases.
When the watchdog detects fatal events, such as being unable to reach
trusted_servers, it terminates itself with the POOL_EXIT_FATAL exit status
code. In this case the parent of the watchdog, the pgpool main process's
SIGCHLD handler reaper(), exits and the on_exit callback calls
system_will_go_down(), which in turn calls terminate_all_children().
The problem is that terminate_all_children() forgot to kill the health check
process. This commit fixes that.
Also, some poorly behaved code is improved.
Back-patched to 3.7, where the bug was introduced.
Bo Peng [Fri, 7 Jun 2019 08:19:37 +0000 (17:19 +0900)]
Fix to deal with backslashes according to the config of standard_conforming_strings
in native replication mode.
per bug467.
Tatsuo Ishii [Sun, 2 Jun 2019 02:40:40 +0000 (11:40 +0900)]
Doc: add description to pg_md5 man page how to show pool_passwd ready string.
Sometimes it is necessary to just show md5 hash string suitable for
pool_passwd, without adding an entry to pool_passwd.
Tatsuo Ishii [Sun, 26 May 2019 04:23:33 +0000 (13:23 +0900)]
Doc: add general description about failover.
Bo Peng [Thu, 23 May 2019 10:46:27 +0000 (19:46 +0900)]
Fix compile error on FreeBSD.
Add missing include file "netinet/in.h".
per bug519 and bug512.
Tatsuo Ishii [Wed, 22 May 2019 22:34:03 +0000 (07:34 +0900)]
Make the failover-in-progress check more aggressive.
In pool_virtual_master_db_node_id(), the case where the session context is
not available was not covered by the failover-in-progress check
because I thought it would be too aggressive. However, a report from the field
showed that this could happen while authenticating a client (and it
causes a segfault). So I decided to move the check to the beginning of the
function to cover the case.
Tatsuo Ishii [Wed, 22 May 2019 08:01:47 +0000 (17:01 +0900)]
Fix memory leak in outfuncs.c pointed out by Coverity.
Tatsuo Ishii [Wed, 22 May 2019 07:20:51 +0000 (16:20 +0900)]
Fix NULL pointer dereference pointed out by Coverity.
Tatsuo Ishii [Wed, 22 May 2019 07:29:57 +0000 (16:29 +0900)]
Revert "Fix memory leak pointed out by coverity."
This reverts commit 0100c6b0848d7b9ed41100698b746e4bad2a914e.
Tatsuo Ishii [Wed, 22 May 2019 06:15:37 +0000 (15:15 +0900)]
Fix memory leak pointed out by coverity.
Tatsuo Ishii [Wed, 22 May 2019 01:20:32 +0000 (10:20 +0900)]
Doc: fix mistake in the previous commit.
Follow master command's %P is "old primary node id" and should not
have been changed.
Tatsuo Ishii [Wed, 22 May 2019 00:48:29 +0000 (09:48 +0900)]
Doc: fix mistakenly described %P of failback command and follow master command.
These should have been "current primary node id", rather than "old
primary node id".
Tatsuo Ishii [Tue, 21 May 2019 22:39:37 +0000 (07:39 +0900)]
Make pgpool_adm extension work with PostgreSQL 12.
Now that oid is gone, the signature of CreateTemplateTupleDesc() has
been changed.
Bo Peng [Wed, 15 May 2019 06:57:42 +0000 (15:57 +0900)]
Prepare 3.7.10.
Bo Peng [Wed, 15 May 2019 06:51:09 +0000 (15:51 +0900)]
doc: Update docs version.
Bo Peng [Wed, 15 May 2019 06:42:47 +0000 (15:42 +0900)]
doc: Add release notes 3.4.24-3.7.10.
Bo Peng [Thu, 9 May 2019 08:22:29 +0000 (17:22 +0900)]
Fix the wrong error message "ERROR: connection cache is full", when all backend nodes are down.
When all backend nodes are down, Pgpool-II throws an incorrect
error message "ERROR: connection cache is full". Change the error
message to "all backend nodes are down, pgpool requires at least one valid node".
per bug487.
https://www.pgpool.net/mantisbt/view.php?id=487
Tatsuo Ishii [Fri, 3 May 2019 23:26:55 +0000 (08:26 +0900)]
Doc: add useful link how to create pcp.conf in the pcp reference page.
Also fix some typos.
Tatsuo Ishii [Fri, 3 May 2019 00:02:29 +0000 (09:02 +0900)]
Speed up failover when all of backends are down.
Pgpool-II tries to find the primary node until search_primary_node_timeout
expires even if all of the backends are in down status. This is not
only a waste of time but also makes Pgpool-II look like it has hung, because
while searching for the primary node the failover process is suspended and all of
the Pgpool-II child processes are in defunct state, thus there is no
process which accepts connection requests from clients. Since the
default value of search_primary_node_timeout is 300 seconds, typically this
goes on for 300 seconds. This is not comfortable for users.
So immediately give up finding the primary node regardless of
search_primary_node_timeout and promptly finish the failover process
if all of the backends are in down status.
Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2019-May/003321.html
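The early exit can be sketched like this. search_primary() and all_backends_down() are hypothetical helpers, and a retry count stands in for the real timeout so that the example stays deterministic.

```c
#include <assert.h>
#include <stdbool.h>

/* True if no backend is marked up. */
static bool all_backends_down(const bool *up, int n)
{
	for (int i = 0; i < n; i++)
		if (up[i])
			return false;
	return true;
}

/* Returns the primary node id, or -1 if none was found.
 * *tries_used reports how many search rounds were actually spent. */
static int search_primary(const bool *up, const bool *is_primary, int n,
						  int max_tries, int *tries_used)
{
	*tries_used = 0;
	if (all_backends_down(up, n))
		return -1;				/* nothing to find: give up at once */
	for (int t = 0; t < max_tries; t++)
	{
		(*tries_used)++;
		for (int i = 0; i < n; i++)
			if (up[i] && is_primary[i])
				return i;
		/* a real implementation would sleep and re-probe the nodes here */
	}
	return -1;
}
```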
Tatsuo Ishii [Mon, 29 Apr 2019 23:49:48 +0000 (08:49 +0900)]
Deal with PostgreSQL 12.
recovery.conf cannot be used anymore. Standby's recovery configuration
is now in postgresql.conf. Also "standby.signal" file is needed in
PostgreSQL database cluster directory to start postmaster as a standby
server.
Tatsuo Ishii [Mon, 29 Apr 2019 23:46:06 +0000 (08:46 +0900)]
Deal with PostgreSQL 12.
HeapTupleGetOid() is not available any more in PostgreSQL 12. Use
GETSTRUCT() and refer to oid column of Form_pg_proc.
Takuma Hoshiai [Wed, 24 Apr 2019 02:32:50 +0000 (11:32 +0900)]
Remove unused .sgml file.
basic-config-example.sgml, written in English, exists only in the doc.ja
directory and is not used by the documentation.
Tatsuo Ishii [Sun, 21 Apr 2019 06:57:22 +0000 (15:57 +0900)]
Avoid exit/fork storm of pool_worker_child process.
pool_worker_child issues a query to get the WAL position using do_query(),
which could throw a FATAL error. In this case the pool_worker_child process
exits and the Pgpool-II parent immediately forks a new process. This cycle
repeats indefinitely and puts a high load on the system.
This could easily happen. For example, if the ALWAYS_MASTER flag is
mistakenly set on a standby node, it will cause an error:
ERROR: recovery is in progress
HINT: WAL control functions cannot be executed during recovery.
STATEMENT: SELECT pg_current_wal_lsn()
To avoid the exit/fork storm, sleep for sr_check_period.
Tatsuo Ishii [Wed, 17 Apr 2019 22:52:56 +0000 (07:52 +0900)]
Fix black_function_list's broken default value.
I accidentally broke the entry of pgpool.conf.sample when
database_redirect_preference_list and
app_name_redirect_preference_list were introduced.
Also fix the same mistake in the entry of pgpool.conf.sample-replication.
Issue reported by Sebastiaan Alexander Mannem.
Tatsuo Ishii [Wed, 17 Apr 2019 13:11:00 +0000 (22:11 +0900)]
Fix "not enough space in buffer" error.
The error occurred while processing an error message returned from the
backend, and the cause is that the query string in question is too
big. The problem is that the buffer has a fixed size (8192 bytes). From the
programming point of view there is absolutely no need to use a fixed-size
buffer. So eliminate the fixed-size buffer and use a palloc'ed buffer
instead. This also saves some memory copy work.
Per bug 499.
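The general technique of sizing a buffer from the data, rather than from a compile-time constant, can be sketched as below. format_error() is a hypothetical helper, and malloc() is used as a stand-in for palloc() so the example compiles outside Pgpool-II.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "detail: query" in a buffer sized to fit, however long the
 * query is. Returns NULL on failure; the caller frees the result. */
static char *format_error(const char *detail, const char *query)
{
	/* first pass: ask snprintf how many bytes are needed */
	int len = snprintf(NULL, 0, "%s: %s", detail, query);
	char *buf;

	if (len < 0)
		return NULL;
	buf = malloc((size_t) len + 1);
	if (buf == NULL)
		return NULL;
	/* second pass: format into the exactly sized buffer */
	snprintf(buf, (size_t) len + 1, "%s: %s", detail, query);
	return buf;
}
```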
Tatsuo Ishii [Tue, 16 Apr 2019 06:48:44 +0000 (15:48 +0900)]
Fix DROP DATABASE failure.
When DROP DATABASE gets executed, SIGUSR1 is sent to the Pgpool-II
child process issuing the command. In its SIGUSR1 handler, the
MASTER macro is called while closing all idle connections. The macro
checks whether we are in the failover process, and surely we are. As a result,
the process exits and the DROP DATABASE command is never issued.
Per bug 486. However, the reason for the segfault in the report is not
clear. After commit:
https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=66b5aacfcc045ec1485921a5884b637fcfb6fd73
things could be different. Let the user test the latest version in the
git repo and see if the problem is solved...
Tatsuo Ishii [Thu, 11 Apr 2019 08:32:19 +0000 (17:32 +0900)]
Doc: fix typo.
Takuma Hoshiai [Wed, 10 Apr 2019 07:04:22 +0000 (16:04 +0900)]
Doc: add restriction entry
Backport master branch commit ea1998b7350de6882bea25fc3634c4f7673adbde to 3.6-4.0.
Takuma Hoshiai [Wed, 10 Apr 2019 02:55:45 +0000 (11:55 +0900)]
Fix comparison against the wrong variable when reading an old pgpool_status file.
In Pgpool-II 3.4 or later the pgpool_status format changed, and both the old and new formats are supported.
Pgpool might read the status in the file incorrectly when reading the old format.
This is a rare case, and it causes no problem even if it happens.
Tatsuo Ishii [Tue, 9 Apr 2019 05:17:57 +0000 (14:17 +0900)]
Doc: add description about multi-statement queries to restrictions section.
Bo Peng [Sun, 7 Apr 2019 15:01:50 +0000 (00:01 +0900)]
Add test/watchdog_setup to EXTRA_DIST.
See bug470: https://www.pgpool.net/mantisbt/view.php?id=470
Tatsuo Ishii [Sun, 7 Apr 2019 00:42:31 +0000 (09:42 +0900)]
Doc: mention that multi-statement queries are sent to primary node only.
Even if the multi-statement query includes a SET command, it should be
sent to the primary node only. This is not explicitly mentioned anywhere
in the doc.
Per bug 492.
Tatsuo Ishii [Wed, 3 Apr 2019 03:04:58 +0000 (12:04 +0900)]
Fix occasional regression test failure of 014.watchdog_test_quorum_bypass.
The test script did not retry psql while failover was happening, and it
failed. So replace psql with wait_for_pgpool_startup.
Tatsuo Ishii [Tue, 2 Apr 2019 03:56:01 +0000 (12:56 +0900)]
Abort session if failover/failback is ongoing.
If failover/failback is ongoing, there would be a risk that MASTER
node macro cannot be used. If used, it could raise a segfault because
connection to the master node is NULL or bogus.
There are several reports suspected to be caused by this (see bug 481,
482 for example).
Now the guts of the MASTER* macro (pool_virtual_master_db_node_id())
is modified to check Req_info->switching which is true while
failover/failback is ongoing. If true, emit warning message and exit
the process. There's still a small window I know, but this should
greatly reduce the chance to access bogus MASTER connection without
using any locking.
Bo Peng [Tue, 2 Apr 2019 00:21:44 +0000 (09:21 +0900)]
Generate Makefile.in by automake 1.13.4.
Tatsuo Ishii [Sat, 30 Mar 2019 12:56:03 +0000 (21:56 +0900)]
Suppress useless truncation warnings from gcc 8+.
For this purpose update c-compiler.m4 (borrowed from PostgreSQL's
config/c-compiler.m4) and add PGAC_PROG_CC_VAR_OPT(NOT_THE_CFLAGS,
[-Wformat-truncation]) to configure.ac to generate -Wformat-truncation
compiler option.
Tatsuo Ishii [Sat, 30 Mar 2019 13:34:45 +0000 (22:34 +0900)]
Suppress "ar: `u' modifier ignored since `D' is the default (see `U')".
This is actually a bug with libtool. To deal with this, add ARFLAGS
to parser's Makefile.am.
Tatsuo Ishii [Sat, 30 Mar 2019 12:29:41 +0000 (21:29 +0900)]
Suppress compiler warnings.
Suppress compiler warnings regarding write(2) return values being
ignored. Since the calls are made in signal handlers, it's impossible to
print info about errors. To shut up the warnings, create a static
variable and assign the return values from write() to it.
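A minimal sketch of that pattern follows. The names (wakeup_handler(), install_wakeup(), pipe_fds, write_rc) are hypothetical, and the sink is declared volatile sig_atomic_t, which is a conservative choice of mine for a variable written from a handler rather than something the commit message specifies.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static int pipe_fds[2];

/* Sink for write()'s return value. Nothing useful can be reported from
 * inside a signal handler, but storing the value silences the
 * "ignoring return value" warning. */
static volatile sig_atomic_t write_rc;

static void wakeup_handler(int sig)
{
	char c = 'x';

	(void) sig;
	write_rc = (sig_atomic_t) write(pipe_fds[1], &c, 1);
}

/* Create the pipe and install the handler for SIGUSR1.
 * Returns 0 on success, -1 on failure. */
static int install_wakeup(void)
{
	struct sigaction sa;

	if (pipe(pipe_fds) != 0)
		return -1;
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = wakeup_handler;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGUSR1, &sa, NULL);
}
```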
Tatsuo Ishii [Sat, 30 Mar 2019 01:33:58 +0000 (10:33 +0900)]
Fix wrong usage of volatile declaration.
From a PostgreSQL commit message:
Variables used after a longjmp() need to be declared volatile. In
case of a pointer, it's the pointer itself that needs to be declared
volatile, not the pointed-to value.
Same thing can be said to:
volatile StartupPacket *sp;
This should have been:
StartupPacket *volatile sp;
This also suppresses a compiler warning.
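The difference matters whenever a local pointer is assigned after setjmp() and read after longjmp(). The following self-contained sketch uses hypothetical names (run_with_cleanup(), fail()) and a plain char buffer in place of StartupPacket.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>
#include <string.h>

static jmp_buf on_error;

/* Simulates an error raised deep inside the protected block. */
static void fail(void)
{
	longjmp(on_error, 1);
}

/* Returns 1 if the buffer allocated before the jump was still reachable
 * (and was freed) afterwards, 0 otherwise. */
static int run_with_cleanup(void)
{
	/* The pointer itself is volatile: it is modified between setjmp()
	 * and longjmp(), so its value must survive the jump.
	 * "volatile char *packet" would qualify the pointed-to bytes instead. */
	char *volatile packet = NULL;

	if (setjmp(on_error) == 0)
	{
		packet = malloc(16);
		if (packet == NULL)
			return 0;
		strcpy(packet, "startup");
		fail();
		return 0;				/* not reached */
	}

	/* error path, entered via longjmp() */
	if (packet != NULL && strcmp(packet, "startup") == 0)
	{
		free(packet);
		return 1;
	}
	return 0;
}
```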
Tatsuo Ishii [Thu, 28 Mar 2019 04:58:27 +0000 (13:58 +0900)]
Fix memory leak in "batch" mode in extended query.
In "batch" mode, a sync message does not follow every execute
message. Unfortunately, Pgpool-II only discards the memory of the query
context for the last execute message while processing the ready for
query message. For example, if 3 execute messages are sent before the
sync message, the memory of 2 query contexts will not be freed, and this
leads to a serious memory leak.
To fix the problem, the query context memory is now possibly discarded
when a command complete message is returned from the backend, if the query
context is not referenced either by sent messages or pending messages.
If it is not referenced at all, we can discard the query context.
Also, even if it is referenced, it is OK to discard the query context
if it belongs to either an unnamed statement or an unnamed portal, because it
will be discarded anyway when the next unnamed statement or portal is
created.
Per bug 468.
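The discard rule above can be written as a small predicate. The struct and its fields are hypothetical simplifications of the bookkeeping, not the actual Pgpool-II data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified view of who still refers to a query context. */
struct query_context_refs
{
	int		sent_refs;		/* references from sent messages */
	int		pending_refs;	/* references from pending messages */
	bool	unnamed;		/* belongs to an unnamed statement or portal */
};

/* Decide, on command complete, whether the query context can be freed. */
static bool can_discard(const struct query_context_refs *r)
{
	/* not referenced at all: safe to discard */
	if (r->sent_refs == 0 && r->pending_refs == 0)
		return true;
	/* referenced, but unnamed: it would be replaced by the next
	 * unnamed statement or portal anyway */
	return r->unnamed;
}
```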