Tatsuo Ishii [Fri, 9 Aug 2019 08:04:28 +0000 (17:04 +0900)]
Fix "unable to bind. cannot get parse message" error.
This was caused by an overly eager memory free in parse_before_bind. It
called the
pool_remove_sent_message/pool_create_sent_message/pool_add_sent_message
combo to replace the query context in the sent message. Unfortunately
pool_remove_sent_message frees memory such as the statement name, which
was being passed by the caller. As a result, the new sent message
created by pool_create_sent_message pointed to the freed statement
name. That memory might be overwritten by a later allocation, which
could make a search by statement name fail. The fix is, instead of
calling pool_remove_sent_message etc., to just replace the query context
in the sent message.
Per bug 531.
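A minimal sketch of the in-place replacement idea. The types and function names below (SentMessage, QueryContext, sent_message_*) are simplified, hypothetical stand-ins and not Pgpool-II's actual definitions. The point is that the message keeps its own copy of the statement name and only the query context pointer is swapped, so nothing the caller still uses gets freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for Pgpool-II's internal types. */
typedef struct
{
    int id;
} QueryContext;

typedef struct
{
    char         *name;           /* statement name, owned by the message */
    QueryContext *query_context;  /* not owned by the message */
} SentMessage;

/* Create a message that owns a private copy of the statement name. */
static SentMessage *
sent_message_create(const char *name, QueryContext *qc)
{
    SentMessage *msg = malloc(sizeof(*msg));

    if (msg == NULL)
        return NULL;
    msg->name = malloc(strlen(name) + 1);
    if (msg->name == NULL)
    {
        free(msg);
        return NULL;
    }
    strcpy(msg->name, name);
    msg->query_context = qc;
    return msg;
}

/*
 * Replace only the query context. The statement name is left untouched,
 * so later lookups by name keep working.
 */
static void
sent_message_replace_context(SentMessage *msg, QueryContext *new_qc)
{
    assert(msg != NULL);
    msg->query_context = new_qc;
}

/* Free the message and the name it owns. */
static void
sent_message_free(SentMessage *msg)
{
    if (msg == NULL)
        return;
    free(msg->name);
    free(msg);
}
```

With this shape, a caller can pass the message's own name around while the context is replaced, without the name being freed in between.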
Muhammad Usama [Thu, 8 Aug 2019 14:03:40 +0000 (19:03 +0500)]
Fix for
0000483: online-recovery is blocked after a child process exits ...
The problem is that if some child process exits abnormally during the second stage
of online recovery, the connection counter that keeps track of exiting
processes does not get decremented and Pgpool-II keeps waiting for the exit of
the already exited process. Eventually, the recovery fails after
client_idle_limit_in_recovery expires.
The fix for this issue is to set the connection counter to zero when
client_idle_limit_in_recovery is enabled and its value is less than
recovery_timeout, since all clients must have been kicked out by the time
client_idle_limit_in_recovery expires.
A similar fix was already committed as part of bug 431 by Tatsuo Ishii, so this
commit basically imports the same logic into the watchdog function that processes
the remote online recovery requests.
Apart from the above-mentioned change, Hoshiai San identified that the watchdog
IPC command timeout for the online recovery start function executed through watchdog
is set to exactly the same value as recovery_timeout, and needs to be increased
for the solution to work correctly.
Muhammad Usama [Wed, 15 May 2019 21:36:35 +0000 (02:36 +0500)]
Fix for [pgpool-hackers: 3295] duplicate failover request ...
Pgpool should keep the backend health check running on quarantined nodes so
that when connectivity resumes, they are automatically removed
from quarantine. Otherwise a temporary network glitch could send the node
into a permanent quarantine state.
Muhammad Usama [Wed, 7 Aug 2019 15:22:01 +0000 (20:22 +0500)]
Fix for no primary on standby pgpool when primary is quarantined on master
Master watchdog Pgpool sends primary_node_id = -1 in the backend status sync
message if the primary node is quarantined on it. So standby watchdog Pgpool
must not update its primary_node_id if the primary backend node id in sync
message is invalid_node_id (-1) while the same sync message reports the
backend status of the current primary node as "NOT DOWN".
The issue was reported by "Tatsuo Ishii <ishii@sraoss.co.jp>" and fixed by me.
Tatsuo Ishii [Thu, 8 Aug 2019 02:02:50 +0000 (11:02 +0900)]
Import some of memory manager debug facilities from PostgreSQL.
Now we can use CLOBBER_FREED_MEMORY, which is useful to detect
accesses to already pfreed memory.
Takuma Hoshiai [Mon, 29 Jul 2019 06:08:53 +0000 (15:08 +0900)]
Fix watchdog_setup command option
The mode option was handled incorrectly: when the pgpool_setup command was
called by the watchdog_setup command, the mode option was not set.
Tatsuo Ishii [Sun, 28 Jul 2019 02:11:07 +0000 (11:11 +0900)]
Fix pgpool_setup to produce correct follow master command.
The produced script incorrectly checked whether PostgreSQL was running
or not, and as a result it mistakenly concluded that PostgreSQL was
always running.
Bo Peng [Thu, 25 Jul 2019 00:19:53 +0000 (09:19 +0900)]
Fix regression test errors.
Bo Peng [Wed, 24 Jul 2019 12:19:26 +0000 (21:19 +0900)]
Use pg_get_expr() instead of pg_attrdef.adsrc to support PostgreSQL 12.
Since PostgreSQL 12 removed pg_attrdef.adsrc, use pg_get_expr() instead of pg_attrdef.adsrc if the backend version is 7.3 or later.
Thanks to Takuma Hoshiai for creating the patch.
Tatsuo Ishii [Wed, 17 Jul 2019 07:51:31 +0000 (16:51 +0900)]
Fix the failover() so that it does not access out of array.
Per Coverity.
Tatsuo Ishii [Wed, 17 Jul 2019 07:48:37 +0000 (16:48 +0900)]
Enhance shutdown script of pgpool_setup.
I observed occasional regression test failures caused by a bind error on
the TCP/IP port. This fix uses the netstat command to check whether the
TCP/IP port is still in use while executing the shutdown script.
Tatsuo Ishii [Tue, 16 Jul 2019 06:21:10 +0000 (15:21 +0900)]
Backport Pgversion().
Tatsuo Ishii [Sun, 7 Jul 2019 13:58:35 +0000 (22:58 +0900)]
Fix possible out of array index access.
It was pointed out by Coverity that node_id could be -1.
Tatsuo Ishii [Sun, 7 Jul 2019 01:09:25 +0000 (10:09 +0900)]
Fix query cache module so that it checks oid array's bound.
Tatsuo Ishii [Sat, 6 Jul 2019 23:08:25 +0000 (08:08 +0900)]
Fix off-by-one error in query cache module.
When debug print is enabled, it might have tried to access beyond the
bounds of the oid array.
Tatsuo Ishii [Fri, 5 Jul 2019 05:32:43 +0000 (14:32 +0900)]
Allow health check process to reload pgpool.conf.
When separate health check process was introduced, we forgot to send
signal to the health check process when pgpool.conf reload is
requested.
Tatsuo Ishii [Wed, 3 Jul 2019 03:59:11 +0000 (12:59 +0900)]
Bug525: Fix segfault when query cache is enabled.
When query cache is enabled,
session_context->query_context->skip_cache_commit flag was set or
reset while processing execute message. The problem was that this was
done before session_context->query_context was set. So the fix is to just
set/reset query_context->skip_cache_commit. session_context->query_context is
set later on anyway.
Per bug 525.
Tatsuo Ishii [Tue, 2 Jul 2019 09:40:11 +0000 (18:40 +0900)]
Make shutdownall to wait for completion of shutdown of Pgpool-II.
It was observed that the regression test occasionally failed because the
previous test had not completely finished before the next test started. To fix
the problem, make shutdownall script generated by pgpool_setup to wait
for completion of shutdown of Pgpool-II.
Tatsuo Ishii [Tue, 2 Jul 2019 00:08:57 +0000 (09:08 +0900)]
Downgrade LOG to DEBUG5 in sent message module.
The log was added in commit 56a6b6a72, but in some cases it
disturbs users.
Discussion: [pgpool-general: 6620] Fwd: A lot of "checking zapping sent message" in log
Tatsuo Ishii [Mon, 24 Jun 2019 13:13:18 +0000 (22:13 +0900)]
Fix mistake introduced in the previous commit.
Tatsuo Ishii [Mon, 24 Jun 2019 01:57:34 +0000 (10:57 +0900)]
Fix segfault when "samenet" is specified in pool_hba.conf.
When "samenet" is specified, SockAddr_cidr_mask(struct
sockaddr_storage *mask, char *numbits, int family) gets called with
numbits == NULL. However the function was not prepared for
it. Originally the function was imported from PostgreSQL. When the bug
was fixed in PostgreSQL, unfortunately the fix was not applied to
Pgpool-II. This commit applies the same fix as PostgreSQL.
Discussion: [pgpool-general: 6601] Pgpool-II + hba + samenet = segfault in libc-2.24.so
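A sketch of the NULL handling, assuming (as in the PostgreSQL version of this function, to the best of my understanding) that a missing numbits means a full-length mask: 32 bits for IPv4 and 128 for IPv6. cidr_prefix_bits() is a hypothetical helper that covers only the prefix-length parsing, not the mask construction.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/socket.h>

/*
 * Parse a CIDR prefix length. A NULL numbits (as passed for "samenet")
 * is treated as "all bits set" for the address family instead of being
 * dereferenced. Returns -1 on a malformed or out-of-range value.
 */
static long
cidr_prefix_bits(const char *numbits, int family)
{
    long  bits;
    long  max_bits;
    char *endptr;

    assert(family == AF_INET || family == AF_INET6);
    max_bits = (family == AF_INET) ? 32 : 128;

    if (numbits == NULL)
        return max_bits;

    bits = strtol(numbits, &endptr, 10);
    if (*numbits == '\0' || *endptr != '\0')
        return -1;              /* empty string or trailing garbage */
    if (bits < 0 || bits > max_bits)
        return -1;
    return bits;
}
```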
Bo Peng [Thu, 20 Jun 2019 03:53:40 +0000 (12:53 +0900)]
doc: Fix documentation typos.
Bo Peng [Wed, 19 Jun 2019 02:04:02 +0000 (11:04 +0900)]
doc: Fix documentation typo.
Bo Peng [Wed, 19 Jun 2019 01:51:46 +0000 (10:51 +0900)]
doc: Fix documentation errors in follow_sh script.
Tatsuo Ishii [Tue, 11 Jun 2019 05:10:47 +0000 (14:10 +0900)]
Merge branch 'V3_7_STABLE' of ssh://git.postgresql.org/pgpool2 into V3_7_STABLE
Tatsuo Ishii [Tue, 11 Jun 2019 04:47:42 +0000 (13:47 +0900)]
Fix health check process is not shutting down in certain cases.
When watchdog detects fatal events, including failing to reach
trusted_servers, watchdog kills itself with the POOL_EXIT_FATAL exit status
code. In this case the parent of watchdog, the pgpool main process's
SIGCHLD handler reaper(), exits and the on_exit callback calls
system_will_go_down(), which in turn calls terminate_all_children().
The problem is that terminate_all_children() forgot to kill the health check
process. This commit fixes that.
Also, some poorly behaving code is improved.
Back patched to 3.7, where the bug was introduced.
Bo Peng [Fri, 7 Jun 2019 08:19:37 +0000 (17:19 +0900)]
Fix to deal with backslashes according to the config of standard_conforming_strings
in native replication mode.
per bug467.
Tatsuo Ishii [Sun, 2 Jun 2019 02:40:40 +0000 (11:40 +0900)]
Doc: add description to pg_md5 man page how to show pool_passwd ready string.
Sometimes it is necessary to just show md5 hash string suitable for
pool_passwd, without adding an entry to pool_passwd.
Tatsuo Ishii [Sun, 26 May 2019 04:23:33 +0000 (13:23 +0900)]
Doc: add general description about failover.
Bo Peng [Thu, 23 May 2019 10:46:27 +0000 (19:46 +0900)]
Fix compile error on freebsd.
Add missing include file "netinet/in.h".
per bug519 and bug512.
Tatsuo Ishii [Wed, 22 May 2019 22:34:03 +0000 (07:34 +0900)]
Make the failover-in-progress check more aggressive.
In pool_virtual_master_db_node_id() the case when session context is
not available was not covered by the failover in progress checking
because I thought it'd be too aggressive. However a report from field
showed that that could happen while authenticating a client (and it
causes a segfault). So I decided to move the check to beginning of the
function to cover the case.
Tatsuo Ishii [Wed, 22 May 2019 08:01:47 +0000 (17:01 +0900)]
Fix memory leak in outfuncs.c pointed out by Coverity.
Tatsuo Ishii [Wed, 22 May 2019 07:20:51 +0000 (16:20 +0900)]
Fix NULL pointer dereference pointed out by Coverity.
Tatsuo Ishii [Wed, 22 May 2019 07:29:57 +0000 (16:29 +0900)]
Revert "Fix memory leak pointed out by coverity."
This reverts commit 0100c6b0848d7b9ed41100698b746e4bad2a914e.
Tatsuo Ishii [Wed, 22 May 2019 06:15:37 +0000 (15:15 +0900)]
Fix memory leak pointed out by coverity.
Tatsuo Ishii [Wed, 22 May 2019 01:20:32 +0000 (10:20 +0900)]
Doc: fix mistake in the previous commit.
Follow master command's %P is "old primary node id" and should not
have been changed.
Tatsuo Ishii [Wed, 22 May 2019 00:48:29 +0000 (09:48 +0900)]
Doc: fix mistakenly described %P of failback command and follow master command.
These should have been "current primary node id", rather than "old
primary node id".
Tatsuo Ishii [Tue, 21 May 2019 22:39:37 +0000 (07:39 +0900)]
Make pgpool_adm extension deal with PostgreSQL 12.
Now that oid is gone, the signature of CreateTemplateTupleDesc() has
been changed.
Bo Peng [Wed, 15 May 2019 06:57:42 +0000 (15:57 +0900)]
Prepare 3.7.10.
Bo Peng [Wed, 15 May 2019 06:51:09 +0000 (15:51 +0900)]
doc: Update docs version.
Bo Peng [Wed, 15 May 2019 06:42:47 +0000 (15:42 +0900)]
doc: Add release notes 3.4.24-3.7.10.
Bo Peng [Thu, 9 May 2019 08:22:29 +0000 (17:22 +0900)]
Fix the wrong error message "ERROR: connection cache is full", when all backend nodes are down.
When all backend nodes are down, Pgpool-II throws an incorrect
error message "ERROR: connection cache is full". Change the error
message to "all backend nodes are down, pgpool requires at least one valid node".
per bug487.
https://www.pgpool.net/mantisbt/view.php?id=487
Tatsuo Ishii [Fri, 3 May 2019 23:26:55 +0000 (08:26 +0900)]
Doc: add useful link how to create pcp.conf in the pcp reference page.
Also fix some typos.
Tatsuo Ishii [Fri, 3 May 2019 00:02:29 +0000 (09:02 +0900)]
Speed up failover when all of backends are down.
Pgpool-II tries to find the primary node until search_primary_node_timeout
expires even if all of the backends are in down status. This is not
only a waste of time but also makes Pgpool-II look as if it hung, because
while searching for the primary node the failover process is suspended and all of
the Pgpool-II child processes are in defunct state, thus there's no
process which accepts connection requests from clients. Since the
default value of search_primary_node_timeout is 300 seconds, typically this
keeps on for 300 seconds. This is not comfortable for users.
So immediately give up finding the primary node regardless of
search_primary_node_timeout and promptly finish the failover process
if all of the backends are in down status.
Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2019-May/003321.html
Tatsuo Ishii [Mon, 29 Apr 2019 23:49:48 +0000 (08:49 +0900)]
Deal with PostgreSQL 12.
recovery.conf cannot be used anymore. Standby's recovery configuration
is now in postgresql.conf. Also "standby.signal" file is needed in
PostgreSQL database cluster directory to start postmaster as a standby
server.
Tatsuo Ishii [Mon, 29 Apr 2019 23:46:06 +0000 (08:46 +0900)]
Deal with PostgreSQL 12.
HeapTupleGetOid() is not available any more in PostgreSQL 12. Use
GETSTRUCT() and refer to oid column of Form_pg_proc.
Takuma Hoshiai [Wed, 24 Apr 2019 02:32:50 +0000 (11:32 +0900)]
Remove unused .sgml file.
basic-config-example.sgml, which is written in English, exists only in the
doc.ja directory and is not used by the documentation.
Tatsuo Ishii [Sun, 21 Apr 2019 06:57:22 +0000 (15:57 +0900)]
Avoid exit/fork storm of pool_worker_child process.
pool_worker_child issues a query to get the WAL position using do_query(),
which could throw a FATAL error. In this case the pool_worker_child process
exits and the Pgpool-II parent immediately forks a new process. This cycle
repeats indefinitely and puts a high load on the system.
This could easily happen. For example if ALWAYS_MASTER flag is
mistakenly set to standby node, it will cause an error:
ERROR: recovery is in progress
HINT: WAL control functions cannot be executed during recovery.
STATEMENT: SELECT pg_current_wal_lsn()
To avoid the exit/fork storm, sleep sr_check_period.
Tatsuo Ishii [Wed, 17 Apr 2019 22:52:56 +0000 (07:52 +0900)]
Fix black_function_list's broken default value.
I accidentally broke the entry of pgpool.conf.sample when
database_redirect_preference_list and
app_name_redirect_preference_list were introduced.
Also fix mistake of the entry of pgpool.conf.sample-replication as
well.
Issue reported by Sebastiaan Alexander Mannem.
Tatsuo Ishii [Wed, 17 Apr 2019 13:11:00 +0000 (22:11 +0900)]
Fix "not enough space in buffer" error.
The error occurred while processing error message returned from
backend and the cause is that the query string in question is too
big. Problem is, the buffer is in fixed size (8192 bytes). From the
programming point of view there's absolutely no need to use fixed size
buffer. So eliminate the fixed size buffer and use palloced buffer
instead. This also saves some memory copy work.
Per bug 499.
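A minimal sketch of sizing the buffer to the message instead of using a fixed 8192-byte array. It uses plain malloc() and the C99 snprintf() length query rather than Pgpool-II's palloc(), and format_error_message() is a hypothetical helper, not the function changed by this commit.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format "prefix: query" into a heap buffer that is exactly large
 * enough, however long the query string is. The caller frees the result.
 */
static char *
format_error_message(const char *prefix, const char *query)
{
    int   needed;
    char *buf;

    assert(prefix != NULL && query != NULL);

    /* First pass: ask snprintf() how many bytes the output needs. */
    needed = snprintf(NULL, 0, "%s: %s", prefix, query);
    if (needed < 0)
        return NULL;

    buf = malloc((size_t) needed + 1);
    if (buf == NULL)
        return NULL;

    /* Second pass: write into the correctly sized buffer. */
    snprintf(buf, (size_t) needed + 1, "%s: %s", prefix, query);
    return buf;
}
```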
Tatsuo Ishii [Tue, 16 Apr 2019 06:48:44 +0000 (15:48 +0900)]
Fix DROP DATABASE failure.
When DROP DATABASE gets executed, SIGUSR1 is sent to the Pgpool-II
child process issuing the command. In its SIGUSR1 handler, the
MASTER macro is called while closing all idle connections. The macro
checks whether we are in the failover process, and surely we are. As a
result, the process exits and the DROP DATABASE command is never issued.
Per bug 486. However the reason for the segfault in the report is not
clear. After commit:
https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=66b5aacfcc045ec1485921a5884b637fcfb6fd73
things could be different. Let the user test the latest version in the
git repo and see if the problem is solved...
Tatsuo Ishii [Thu, 11 Apr 2019 08:32:19 +0000 (17:32 +0900)]
Doc: fix typo.
Takuma Hoshiai [Wed, 10 Apr 2019 07:04:22 +0000 (16:04 +0900)]
Doc: add restriction entry
Backport master branch's commit ea1998b7350de6882bea25fc3634c4f7673adbde to 3.6-4.0.
Takuma Hoshiai [Wed, 10 Apr 2019 02:55:45 +0000 (11:55 +0900)]
Fix comparison against the wrong variable when reading an old pgpool_status file.
In Pgpool-II 3.4 or later, the pgpool_status format changed, and both the old and new formats are supported.
Pgpool might read the status in the file incorrectly when reading the old format.
This is a rare case, and it causes no problem even if it happens.
Tatsuo Ishii [Tue, 9 Apr 2019 05:17:57 +0000 (14:17 +0900)]
Doc: add description about multi-statement queries to restrictions section.
Bo Peng [Sun, 7 Apr 2019 15:01:50 +0000 (00:01 +0900)]
Add test/watchdog_setup to EXTRA_DIST.
See bug470: https://www.pgpool.net/mantisbt/view.php?id=470
Tatsuo Ishii [Sun, 7 Apr 2019 00:42:31 +0000 (09:42 +0900)]
Doc: mention that multi-statement queries are sent to primary node only.
Even if the multi-statement query includes a SET command, it should be
sent to the primary node only. This was not explicitly mentioned anywhere
in the doc.
Per bug 492.
Tatsuo Ishii [Wed, 3 Apr 2019 03:04:58 +0000 (12:04 +0900)]
Fix occasional regression test failure of 014.watchdog_test_quorum_bypass.
The test script did not retry psql while failover was happening, and
failed. So replace psql with wait_for_pgpool_startup.
Tatsuo Ishii [Tue, 2 Apr 2019 03:56:01 +0000 (12:56 +0900)]
Abort session if failover/failback is ongoing.
If failover/failback is ongoing, there would be a risk that MASTER
node macro cannot be used. If used, it could raise a segfault because
connection to the master node is NULL or bogus.
There are several reports suspected to be caused by this (see bug 481,
482 for example).
Now the guts of the MASTER* macro (pool_virtual_master_db_node_id())
is modified to check Req_info->switching which is true while
failover/failback is ongoing. If true, emit warning message and exit
the process. There's still a small window I know, but this should
greatly reduce the chance to access bogus MASTER connection without
using any locking.
Bo Peng [Tue, 2 Apr 2019 00:21:44 +0000 (09:21 +0900)]
Generate Makefile.in by automake 1.13.4.
Tatsuo Ishii [Sat, 30 Mar 2019 12:56:03 +0000 (21:56 +0900)]
Suppress useless truncation warnings from gcc 8+.
For this purpose update c-compiler.m4 (borrowed from PostgreSQL's
config/c-compiler.m4) and add PGAC_PROG_CC_VAR_OPT(NOT_THE_CFLAGS,
[-Wformat-truncation]) to configure.ac to generate -Wformat-truncation
compiler option.
Tatsuo Ishii [Sat, 30 Mar 2019 13:34:45 +0000 (22:34 +0900)]
Suppress "ar: `u' modifier ignored since `D' is the default (see `U')".
This is actually a bug with libtool. To deal with this, add ARFLAGS
to parser's Makefile.am.
Tatsuo Ishii [Sat, 30 Mar 2019 12:29:41 +0000 (21:29 +0900)]
Suppress compiler warnings.
Suppress compiler warnings regarding write(2) return values being
ignored. Since they are used in signal handlers, it's impossible to
print info about errors. To shut up the warnings, create a static
variable and assign the return values from write().
Tatsuo Ishii [Sat, 30 Mar 2019 01:33:58 +0000 (10:33 +0900)]
Fix wrong usage of volatile declaration.
From a PostgreSQL commit message:
Variables used after a longjmp() need to be declared volatile. In
case of a pointer, it's the pointer itself that needs to be declared
volatile, not the pointed-to value.
The same can be said of:
volatile StartupPacket *sp;
This should have been:
StartupPacket *volatile sp;
This also suppresses a compiler warning.
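The difference can be illustrated with a small setjmp()/longjmp() sketch. The function and variable names are hypothetical; the only point is where the volatile qualifier goes when a pointer is modified between setjmp() and longjmp() and read afterwards.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

static jmp_buf env;

/* Simulate an error that unwinds back to the setjmp() point. */
static void
fail(void)
{
    longjmp(env, 1);
}

/* Returns the buffer that was current when the failure occurred. */
static const char *
process_with_recovery(const char *first, const char *second)
{
    /*
     * "const char *volatile current" makes the pointer itself volatile.
     * "volatile const char *current" would only qualify the pointed-to
     * characters and would leave the pointer value indeterminate after
     * longjmp().
     */
    const char *volatile current = first;

    assert(first != NULL && second != NULL);

    if (setjmp(env) == 0)
    {
        current = second;   /* modified between setjmp() and longjmp() */
        fail();
    }

    /* Reached via longjmp(): current reliably holds the updated value. */
    return current;
}
```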
Tatsuo Ishii [Thu, 28 Mar 2019 04:58:27 +0000 (13:58 +0900)]
Fix memory leak in "batch" mode in extended query.
In "batch" mode, a sync message does not follow every execute
message. Unfortunately Pgpool-II only discarded the memory of the query
context for the last execute message while processing the ready for
query message. For example, if 3 execute messages are sent before the
sync message, the memory of 2 query contexts will not be freed, and this
leads to a serious memory leak.
To fix the problem, now the query context memory is possibly discarded
when a command complete message is returned from backend if the query
context is not referenced either by sent messages or pending messages.
If it is not referenced at all, we can discard the query context.
Also even if it is referenced, it is ok to discard the query context
if it is either an unnamed statement or an unnamed portal because it
will be discarded anyway when next unnamed statement or portal is
created.
Per bug 468.
Bo Peng [Thu, 28 Mar 2019 11:56:35 +0000 (20:56 +0900)]
Change pgpool.spec.
Bo Peng [Thu, 28 Mar 2019 11:29:36 +0000 (20:29 +0900)]
Update pgpool_socket_dir.patch file.
Bo Peng [Thu, 28 Mar 2019 09:30:54 +0000 (18:30 +0900)]
Prepare 3.7.9.
Bo Peng [Thu, 28 Mar 2019 09:24:32 +0000 (18:24 +0900)]
Doc: Update docs version.
Bo Peng [Thu, 28 Mar 2019 09:18:10 +0000 (18:18 +0900)]
Doc: Add release-notes 4.0.4-3.4.23.
Conflicts:
doc.ja/src/sgml/release-4.0.sgml
doc/src/sgml/release-4.0.sgml
Tatsuo Ishii [Wed, 27 Mar 2019 10:16:49 +0000 (19:16 +0900)]
Doc: add ssl_prefer_server_ciphers parameter to Japanese doc.
Muhammad Usama [Wed, 27 Mar 2019 07:51:20 +0000 (12:51 +0500)]
Add new configuration option ssl_prefer_server_ciphers
Add the new setting "ssl_prefer_server_ciphers" to let users configure if they
want client's or server's cipher order to take preference.
Yugo Nagata [Wed, 27 Mar 2019 01:08:32 +0000 (10:08 +0900)]
Specify default value of ssl_ciphers
Tatsuo Ishii [Sat, 23 Mar 2019 04:04:21 +0000 (13:04 +0900)]
Allow to set a client cipher list.
For this purpose new parameter "ssl_ciphers" is added. This is already
implemented in PostgreSQL and useful to enhance security when SSL is
enabled.
Tatsuo Ishii [Mon, 18 Mar 2019 00:45:51 +0000 (09:45 +0900)]
Fix unnecessary fsync to pgpool_status file.
Whenever new connections are created to PostgreSQL backend, fsync()
was issued to pgpool_status file, which could generate excessive I/O
in certain conditions, for example when num_init_children is large and
connections to backend have a certain lifetime limit.
So reduce the chance of issuing fsync() so that it is issued only when
backend status is changed from CON_CONNECT_WAIT or others to CON_UP.
If the status is already CON_UP, we don't need to write to
pgpool_status.
Discussion: [pgpool-general: 6436] High I/O Usage on PGPool nodes
Bo Peng [Thu, 14 Mar 2019 05:21:59 +0000 (14:21 +0900)]
Add "tags" to gitignore file.
Bo Peng [Thu, 7 Mar 2019 02:26:17 +0000 (11:26 +0900)]
Fix some mistakes from previous commit.
Bo Peng [Thu, 7 Mar 2019 01:27:31 +0000 (10:27 +0900)]
Fix indent of pgpool.conf sample files.
Tatsuo Ishii [Wed, 27 Feb 2019 00:38:15 +0000 (09:38 +0900)]
Fix write_status_file()'s signature.
It was mistakenly declared as write_status_file(). Of course this
should be: write_status_file(void).
Bo Peng [Thu, 21 Feb 2019 01:01:05 +0000 (10:01 +0900)]
Prepare 3.7.8.
Bo Peng [Wed, 20 Feb 2019 11:04:56 +0000 (20:04 +0900)]
doc: update doc version.
Bo Peng [Wed, 20 Feb 2019 10:55:21 +0000 (19:55 +0900)]
Add release-notes 3.7.8-3.4.22.
Takuma Hoshiai [Fri, 15 Feb 2019 07:22:06 +0000 (16:22 +0900)]
Fix regression test 068
It was not working correctly because the test case used an old JDBC function and some hard-coded variables.
Tatsuo Ishii [Fri, 15 Feb 2019 05:26:55 +0000 (14:26 +0900)]
Fix configuration change timing regarding memory_cache_enabled.
This parameter must not be changed after Pgpool-II starts, but it was
possible to change it by reloading.
Tatsuo Ishii [Tue, 12 Feb 2019 07:59:35 +0000 (16:59 +0900)]
Fix unwanted recovery timeout in certain cases.
In the second stage of online recovery in replication mode, it is
possible that recovery fails with a timeout (message: "wait_connection_closed:
existing connections did not close in %d sec.") if the connection counter
is corrupted because a child process was aborted by SIGKILL, a segfault, etc.
This can be detected by checking whether client_idle_limit_in_recovery is
enabled and its value is less than recovery_timeout, because all
clients must have been kicked out by the time
client_idle_limit_in_recovery expires. If so, we should reset
conn_counter to 0 as well.
Per bug 431.
Tatsuo Ishii [Tue, 5 Feb 2019 22:32:19 +0000 (07:32 +0900)]
Fix merge conflict in previous commit.
Tatsuo Ishii [Tue, 5 Feb 2019 11:59:38 +0000 (20:59 +0900)]
Reduce memory usage when large data set is returned from backend.
In commit 8640abfc41ff06b1e6d31315239292f4d3d4191d,
pool_wait_till_ready_for_query() was introduced to retrieve all
messages into buffer from backend until it found a "ready for query"
message when extended query protocol is used in streaming replication
mode. It could hit memory allocation limit of palloc(), which is 1GB.
This could be easily reproduced by using pgbench and pgproto for
example.
pgbench -s 100
pgproto data:
'P' "" "SELECT * FROM pgbench_accounts" 0
'B' "" "" 0 0 0
'E' "" 0
'S'
'Y'
To reduce the memory usage, introduce "suspend_reading_from_frontend"
flag in session context so that Pgpool-II does not read any message
from frontend after a sync message is received. The flag is turned off
when a "ready for query" message is received from backend. In the
meantime, Pgpool-II reads messages from backend and forwards them to
frontend as usual. This way we eliminate the need to store messages from
backend in a buffer, which reduces the memory footprint.
Per bug 462.
Tatsuo Ishii [Tue, 5 Feb 2019 09:51:40 +0000 (18:51 +0900)]
Fix syntax error in extended query test script.
Checking "Some process remains" needed double quotes around
a variable.
Tatsuo Ishii [Tue, 29 Jan 2019 08:20:41 +0000 (17:20 +0900)]
Fix corner case bug with strip_quote().
strip_quote(), which is called by pattern_compare(), did not properly
handle the empty query string case. In the worst case it could wipe out
memory after a pointer returned from malloc(), which could cause a
segmentation fault in free() called in pattern_compare().
Per bug 458.
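A sketch of a quote-stripping routine that is safe for an empty input. It is not the actual strip_quote(); strip_double_quotes() is a hypothetical helper, and the remark about how such a bug can arise is an assumption for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a malloc'ed copy of str with all double quotes removed.
 * The caller frees the result.
 */
static char *
strip_double_quotes(const char *str)
{
    size_t len;
    size_t i;
    size_t j = 0;
    char  *out;

    assert(str != NULL);
    len = strlen(str);

    out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    /*
     * The loop tests the bound before touching a character, so an empty
     * string copies nothing. A loop that consumes one character before
     * checking for the terminator would run past the end of the input
     * and write beyond the allocation in that case.
     */
    for (i = 0; i < len; i++)
    {
        if (str[i] != '"')
            out[j++] = str[i];
    }
    out[j] = '\0';
    return out;
}
```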
Tatsuo Ishii [Sun, 27 Jan 2019 02:03:14 +0000 (11:03 +0900)]
Mention that schema qualifications cannot be used in white/black_function_list.
Takuma Hoshiai [Wed, 23 Jan 2019 00:28:43 +0000 (09:28 +0900)]
Fix typo about wd_priority in watchdog_setup.
Tatsuo Ishii [Thu, 10 Jan 2019 03:20:07 +0000 (12:20 +0900)]
Fix Pgpool child segfault in a race condition.
1) frontend tries to connect to Pgpool-II
2) there's no existing connection cache
3) try to create new backend connections by calling connect_backend()
4) inside connect_backend(), pool_create_cp() gets called
5) pool_create_cp() calls new_connection()
6) failover occurs and the global backend status is set to down, but
the pgpool main does not send kill signal to the child process yet
7) inside new_connection() after checking VALID_BACKEND, it checks the
global backend status and finds it is set to down status, so that
it returns without creating new connection slot
8) connect_backend() continues and accesses the down connection slot
because local status says it's alive, which results in a segfault.
Since there's already checking for the global status in
new_connection(), a fix could be syncing the local status with the
global status there.
See [pgpool-hackers: 3214] for discussion.
Tatsuo Ishii [Thu, 3 Jan 2019 08:30:04 +0000 (17:30 +0900)]
Doc: fix typo in logdir description.
Per bug 453.
Tatsuo Ishii [Tue, 11 Dec 2018 22:42:31 +0000 (07:42 +0900)]
Fix occasional extended query hang.
If a client sends an extended query message such as close after a sync
message but before the next simple query, Pgpool-II could hang.
:
<= BE ParseComplete
<= BE BindComplete
<= BE CommandComplete(COMMIT)
<= BE ReadyForQuery(I)
FE=> Close(stmt="S2")
FE=> Close(stmt="S1")
FE=> Query (query="BEGIN") [0]
<= BE CloseComplete [1]
<= BE CloseComplete [1]
<= BE CommandComplete(BEGIN) [2]
Because of [1], query in progress flag was reset, then [2] hangs
trying to read from backend which did not receive message from
Pgpool-II because it does not refer to the query context set by [0].
Sending close after sync is not recommended according to the official
document but some sloppy drivers seem to do it. To deal with the
problem, check the doing extended query message flag before resetting
the query in progress flag.
Problem reported by Muhammad Usama.
Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2018-December/003164.html
Tatsuo Ishii [Thu, 6 Dec 2018 08:20:32 +0000 (17:20 +0900)]
Deal with "terminating connection due to idle-in-transaction timeout" error.
If the idle_in_transaction_session_timeout parameter is set to a reasonably
short value in postgresql.conf, the fatal error easily occurs and the
connection from Pgpool-II to the backend is terminated. This leads to
Pgpool-II either hanging (if only one of the PostgreSQL servers sets the
parameter) or an unwanted failover (if all PostgreSQL servers set the
parameter), and neither is good. So intercept the message, send
the same message to the frontend, then exit to terminate the connection to
the frontend. This is similar to the treatment of the error "connection was
terminated due to conflict with recovery, User was holding a relation
lock for too long."
Per bug 448.
Bo Peng [Wed, 5 Dec 2018 05:55:55 +0000 (14:55 +0900)]
doc: Fix Japanese document typo in pcp_common_options.
Bo Peng [Wed, 21 Nov 2018 08:44:33 +0000 (17:44 +0900)]
Prepare 3.7.7.
Takuma Hoshiai [Wed, 21 Nov 2018 08:32:38 +0000 (17:32 +0900)]
Change sort algorithm from bubble sort to quick sort.
This is used to sort startup packet's parameters.
Bo Peng [Wed, 21 Nov 2018 08:22:23 +0000 (17:22 +0900)]
Add release notes.
Takuma Hoshiai [Wed, 21 Nov 2018 02:45:17 +0000 (11:45 +0900)]
Fix to sort startup packet's parameters sent by client.
If the order of the startup packet's parameters differed between the cached connection pool and the connection request, Pgpool-II did not use the cached connection pool and created a new one.
Per bug 444.
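A sketch of why sorting helps: once both parameter lists are sorted by name, two startup packets that differ only in parameter order compare equal. StartupParam and the helper functions are hypothetical; the standard qsort() stands in for the sort used by Pgpool-II.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name/value pair taken from a startup packet. */
typedef struct
{
    const char *name;
    const char *value;
} StartupParam;

/* qsort() comparator: order parameters by name. */
static int
compare_param(const void *a, const void *b)
{
    const StartupParam *pa = a;
    const StartupParam *pb = b;

    return strcmp(pa->name, pb->name);
}

/* Sort parameters in place so their order no longer depends on the client. */
static void
sort_startup_params(StartupParam *params, size_t n)
{
    assert(params != NULL || n == 0);
    if (n > 1)
        qsort(params, n, sizeof(params[0]), compare_param);
}

/* Compare two parameter lists position by position. */
static int
same_startup_params(const StartupParam *a, const StartupParam *b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
    {
        if (strcmp(a[i].name, b[i].name) != 0 ||
            strcmp(a[i].value, b[i].value) != 0)
            return 0;
    }
    return 1;
}
```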