Andres Freund [Mon, 18 Aug 2014 13:41:05 +0000 (15:41 +0200)]
bdr: Consider a sequence vote successful in more scenarios.
If either there are no nays or the number of yays is bigger than the
consensus the vote can be considered successful. We don't want to
always wait for a majority of yays, because that'd possibly delay
voting for a long time, even if enough nodes for a consensus on
another chunk are online.
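A minimal sketch of the success rule described above, not the BDR implementation. The function name and the `consensus` parameter are hypothetical; the commit message does not say how the consensus threshold is derived, so it is passed in as a plain number.

```python
def vote_successful(yays: int, nays: int, consensus: int) -> bool:
    """Return True if a sequence vote can be treated as successful.

    Assumption: `consensus` is the number of yays that must be exceeded;
    how it is computed is outside the scope of this sketch.
    """
    # Either nobody objected, or enough nodes agreed despite objections.
    return nays == 0 or yays > consensus
```

With this rule a vote with no nays succeeds without waiting for further yays, which is the delay the commit message wants to avoid.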
Andres Freund [Mon, 18 Aug 2014 13:16:43 +0000 (15:16 +0200)]
bdr: Limit the amount of time spent in voting.
Don't endlessly vote if there are votes to be done. As voting happens
inside a single xact that'll just lead to large transactions that the
other nodes can't see yet.
Andres Freund [Mon, 18 Aug 2014 11:09:11 +0000 (13:09 +0200)]
bdr: Vote on several sequences at once to speed up the voting process.
We can only vote on elections started by a single node at a time
without further complications. The problem is that otherwise two votes
could conflict which isn't trivial to detect in a query. Select one
node with open elections and vote on up to 1000 elections started by
that node.
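The selection step above can be sketched roughly as follows. This is an illustration only: the `Election` record, its field names and the "first node found" choice are assumptions, and the real code does this in a query rather than in application logic.

```python
from collections import namedtuple

# Hypothetical record for an open election; field names are illustrative.
Election = namedtuple("Election", ["owner_node", "seq_id"])

MAX_ELECTIONS_PER_ROUND = 1000  # batch limit named in the commit message


def pick_elections(open_elections, limit=MAX_ELECTIONS_PER_ROUND):
    """Pick one node with open elections and return up to `limit` of them."""
    if not open_elections:
        return []
    # Assumption: any node with open elections will do; take the first one.
    owner = open_elections[0].owner_node
    batch = [e for e in open_elections if e.owner_node == owner]
    return batch[:limit]
```

Restricting a round to one originating node sidesteps the conflict detection problem the message mentions, at the cost of needing several rounds when many nodes have open elections.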
Andres Freund [Mon, 18 Aug 2014 11:07:19 +0000 (13:07 +0200)]
bdr: Add indexes to make sequence voting faster.
Andres Freund [Mon, 18 Aug 2014 11:06:37 +0000 (13:06 +0200)]
bdr: Fix type of bdr_votes.dboid column to oid from bigint
This prevented index usage in at least one query.
Petr Jelinek [Sun, 17 Aug 2014 14:36:22 +0000 (16:36 +0200)]
bdr: add conflict logging tests
New isolation tests for simple insert_insert, update_update,
update_delete and delete_delete resolving/logging tests.
In passing make the isolation test initialization more sane.
Petr Jelinek [Sun, 17 Aug 2014 14:28:57 +0000 (16:28 +0200)]
bdr: make conflict logging behave consistently
Conflict logging was a mess: some of the conflicts were logged to a table,
some to the server log, some as DEBUG, some as LOG, the log messages looked
completely different for each conflict, etc.
This patch introduces a single interface for logging into the server log which
is always called, has a common log line template and handles all
current cases.
The other change is that all conflicts can now be logged to the log table
(previously only insert_insert conflicts could be logged there).
This is step 1, not the final code, but it brings enough improvement that
it's worth committing as is.
Petr Jelinek [Sun, 17 Aug 2014 14:27:29 +0000 (16:27 +0200)]
bdr: add delete_delete conflict type to bdr_conflict_type.
Bump the BDR extension version accordingly and, while at it, bump the BDR version too.
Petr Jelinek [Sun, 17 Aug 2014 11:59:41 +0000 (13:59 +0200)]
bdr: add minimum bdr version to the output plugin options
This enables BDR upgrades without dump/reload in the future.
Note that this change breaks compatibility with any previous version.
Petr Jelinek [Fri, 15 Aug 2014 14:46:37 +0000 (16:46 +0200)]
bdr: tests for ddl lock conflict
Andres Freund [Fri, 15 Aug 2014 14:33:56 +0000 (16:33 +0200)]
bdr: Improve concurrency and error recovery of ddl locking.
Previously there were three major issues:
* If we failed acquiring the lock on some, but not all nodes we didn't
always send out a message forcing the lock to be released again on
the succeeding node.
* There were some scenarios in which a lock was held only
intermittently in a way that allowed the lock's state to change in
unhandled ways.
* Two backends within a single node didn't really protect against each
other requiring the ddl lock. That's not really problematic in itself
because locally there's plain relation locking to take care of things,
but two concurrent lock acquisitions could cause problems.
Now we'll always send lock release messages if we fail during the
acquisition of a lock. That means that nodes need to be prepared to
deal with the fact that a lock is released that's not held.
RT-#37905
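The last point above (a release may arrive for a lock that is not held) can be illustrated with a toy model. This is a sketch under assumptions, not BDR's lock code: the class and method names are hypothetical and a single holder per node view is assumed.

```python
class DdlLockState:
    """Toy model of one node's view of the global DDL lock."""

    def __init__(self):
        self.holder = None  # id of the node currently holding the lock, if any

    def acquire(self, node_id):
        """Grant the lock unless a different node already holds it."""
        if self.holder is not None and self.holder != node_id:
            return False
        self.holder = node_id
        return True

    def release(self, node_id):
        """Release the lock; a release for a lock that is not held is a no-op."""
        if self.holder == node_id:
            self.holder = None
            return True
        # Nothing held by node_id here: ignore the message instead of raising.
        return False
```

Treating the stray release as a no-op is what lets the acquiring node send release messages unconditionally after a failed acquisition.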
Andres Freund [Fri, 15 Aug 2014 14:26:11 +0000 (16:26 +0200)]
bdr: Improve log messages about ddl locking.
A previous commit changed the capitalization wrongly. Also add a bit
more detail about which nodes are holding the lock to some error
messages.
More work is needed, but it's a clear improvement already.
Petr Jelinek [Fri, 15 Aug 2014 09:16:12 +0000 (11:16 +0200)]
bdr: add GRANT/REVOKE tests
Craig Ringer [Fri, 15 Aug 2014 03:16:46 +0000 (11:16 +0800)]
bdr: Emit informative error messages for global sequence failures
Related to bug #37914
Craig Ringer [Fri, 15 Aug 2014 03:02:14 +0000 (11:02 +0800)]
bdr: Minimalist error message notes and link to style guide
Craig Ringer [Fri, 15 Aug 2014 02:23:18 +0000 (10:23 +0800)]
bdr: Avoid date --iso-8601 , OS X doesn't like it
Report per Ian Barwick
Petr Jelinek [Tue, 12 Aug 2014 21:31:21 +0000 (23:31 +0200)]
bdr: make ddl/function tests more stable across environments
Andres Freund [Mon, 8 Sep 2014 15:41:55 +0000 (17:41 +0200)]
bdr: Test ALTER SEQUENCE USING support
Craig Ringer [Tue, 12 Aug 2014 05:45:21 +0000 (13:45 +0800)]
bdr: Make ddl/function test independent of login username
Create two new users, 'super' and 'nonsuper', during init. These
may be used for any test.
Run ddl/function tests as the 'super' user instead of the user
with the same name as the current unix user that pg_regress
will otherwise pick, so output is stable.
Craig Ringer [Tue, 12 Aug 2014 05:09:27 +0000 (13:09 +0800)]
bdr: Don't attempt to use get_database_name unsafely
The prior commit "bdr: Show database name when a DDL lock error is encountered"
added use of get_database_name(..) to clarify an error message. However, it's not
always safe to access the catalogs from this call site.
Remove the unsafe call and document the issue in the code so it won't be added
back later.
Craig Ringer [Tue, 12 Aug 2014 05:04:24 +0000 (13:04 +0800)]
bdr: Remove bdr: prefixes from messages
Petr Jelinek [Mon, 11 Aug 2014 22:42:07 +0000 (00:42 +0200)]
bdr: fix typo in variable name
Petr Jelinek [Mon, 11 Aug 2014 22:39:59 +0000 (00:39 +0200)]
bdr: disallow UPDATEs and DELETEs on tables without PK
Petr Jelinek [Mon, 11 Aug 2014 16:51:27 +0000 (18:51 +0200)]
bdr: refactor apply main loop into separate function
The main loop is now in bdr_apply_work, and was moved from bdr.c to
bdr_apply.c where it belongs more naturally.
In passing start updating rollback stats counter properly.
Petr Jelinek [Mon, 11 Aug 2014 15:47:30 +0000 (17:47 +0200)]
bdr: make bdr stats work for inserts and make updates stats behave consistently
* insert count is increased every time a row was added/updated
* insert conflict count is increased for every logged conflict
* update count is increased every time a row was updated
* update conflict count is increased for every logged conflict
* delete count is increased for every existing row
* delete conflict count is increased for every missing row
Petr Jelinek [Mon, 11 Aug 2014 14:24:30 +0000 (16:24 +0200)]
bdr: change loglevel for DELETE conflicts to LOG for consistency
Craig Ringer [Mon, 11 Aug 2014 11:27:00 +0000 (19:27 +0800)]
bdr: Force DateStyle in configuration
Fixes 37903
Craig Ringer [Mon, 11 Aug 2014 10:29:48 +0000 (18:29 +0800)]
bdr: Refer to bug 37904 in sleep5
Craig Ringer [Mon, 11 Aug 2014 09:44:04 +0000 (17:44 +0800)]
bdr: Make 'is not configured for bdr' error clearer
This usually actually happens when BDR is still starting up.
Craig Ringer [Mon, 11 Aug 2014 08:09:09 +0000 (16:09 +0800)]
bdr: better wait-for-start scripts
Craig Ringer [Mon, 11 Aug 2014 07:49:43 +0000 (15:49 +0800)]
bdr: Use worker type assertions in bdr_locks
Craig Ringer [Mon, 11 Aug 2014 07:31:56 +0000 (15:31 +0800)]
bdr: Introduce global bdr_worker_type to identify kind of current worker
With assertions enabled, permits easy assertion of "this is an
apply worker", "this is a per-db worker", etc.
Craig Ringer [Mon, 11 Aug 2014 06:20:03 +0000 (14:20 +0800)]
bdr: Log all steps of DDL lock acquisition, always prefix with bdr:
The DDL lock acquisition logging wasn't logging all the steps, logging
at consistent log levels, or prefixing things with bdr: for easy grepping.
Add more log detail, mention node IDs when messages are received or locks are
acquired.
Comment the individual steps in lock acquisition.
Craig Ringer [Sun, 10 Aug 2014 12:49:04 +0000 (20:49 +0800)]
bdr: bump up max workers configured for isolation tests
Craig Ringer [Sun, 10 Aug 2014 12:42:30 +0000 (20:42 +0800)]
bdr: If not enough max_worker_processes are available, warn
Craig Ringer [Sun, 10 Aug 2014 08:31:57 +0000 (16:31 +0800)]
bdr: Emit a debug message when each per-db worker starts
This makes it easier to track which workers are per-db workers in the logs,
and associate their pids with other messages from the postmaster.
Craig Ringer [Sun, 10 Aug 2014 08:11:38 +0000 (16:11 +0800)]
bdr: Move the pg_sleep() into a separate "test" during isolation start
isolationtester sends the setup block as a multistatement, which causes
confusion if we need to sleep on all nodes to let DDL locking get ready, but
the statement after the sleep is DDL.
Do a fake "test" that sleeps on all nodes first, instead.
This should not be necessary. It is a workaround for startup races.
Craig Ringer [Fri, 8 Aug 2014 06:11:53 +0000 (14:11 +0800)]
bdr: In isolationtester make target, build regress and iso submake
Petr Jelinek [Fri, 8 Aug 2014 12:51:44 +0000 (14:51 +0200)]
bdr: DDL tests update
* move sequence tests and function tests to separate file
* add CREATE TABLE INHERIT tests
* add CREATE RULE tests
* add ALTER FUNCTION tests
* add SCHEMA object tests
* add VIEW object tests
Petr Jelinek [Fri, 8 Aug 2014 12:49:33 +0000 (14:49 +0200)]
bdr: git ignore bdr_pgbench_check
Craig Ringer [Thu, 7 Aug 2014 12:10:21 +0000 (20:10 +0800)]
bdr: Initial setup of isolation tester for bdr
At this point this only adds the Makefile infrastructure, config file, and a
first script to make sure the cluster is up and ready.
It uses DBs named node1, node2 and node3, and does an init_replica from node1
to the other two.
It's causing memory corruption, so that needs to be figured out before things
can proceed.
Craig Ringer [Thu, 7 Aug 2014 12:09:04 +0000 (20:09 +0800)]
bdr: Improve error message when slot creation rejected due to startup
This is just a small improvement to the error message emitted by the
output plugin when slot creation is rejected because the local end
isn't ready yet.
Craig Ringer [Thu, 7 Aug 2014 10:25:20 +0000 (18:25 +0800)]
bdr: Show database name when a DDL lock error is encountered
Fixes an error message like:
ERROR: database 16384 is not configured for bdr
that was rather unhelpful.
Christoph Monech-Tegeder [Thu, 7 Aug 2014 10:28:00 +0000 (12:28 +0200)]
bdr: repair tests after pg_xlog_wait* changes
Craig Ringer [Thu, 7 Aug 2014 04:04:30 +0000 (12:04 +0800)]
bdr: Document DELETE conflict case and produce a better error message
Craig Ringer [Wed, 6 Aug 2014 14:30:13 +0000 (22:30 +0800)]
bdr: A bit more README.developers on the extension control file
Craig Ringer [Wed, 6 Aug 2014 13:55:59 +0000 (21:55 +0800)]
bdr: Add a readme for devs, with release policy info
Craig Ringer [Wed, 6 Aug 2014 09:03:47 +0000 (17:03 +0800)]
bdr: Allow BDR to be built with PGXS
We need this so that we can deal with upgrades, per #37884
Petr Jelinek [Mon, 4 Aug 2014 23:52:32 +0000 (01:52 +0200)]
bdr: tests for range types
Petr Jelinek [Mon, 4 Aug 2014 23:22:39 +0000 (01:22 +0200)]
bdr: make tests work again after previous psql patch
Petr Jelinek [Mon, 4 Aug 2014 12:50:17 +0000 (14:50 +0200)]
bdr: make pgbenchcheck update
Better interface, optional zigzag RUN MODE.
Petr Jelinek [Mon, 4 Aug 2014 10:42:27 +0000 (12:42 +0200)]
bdr: add high tps test
Available under make pgbenchcheck target.
Runs pgbench on one or both master nodes and afterwards checks data
integrity. Default run time is over 5h, can be overridden using make
pgbenchcheck RUNTIME=<seconds>.
Petr Jelinek [Wed, 30 Jul 2014 23:31:52 +0000 (01:31 +0200)]
bdr: fix old tuple logging after table structure change.
The tuple_to_stringinfo function which is used for logging tuple contents
was calling fastgetattr to get value of an attribute, but fastgetattr
does not work correctly on tuples that were written before table
structure change (column is added or dropped). Fix is to use
heap_getattr.
Regression test is attached as well.
Per report by Keaton Adams (RT-#37869)
Christoph Moench-Tegeder [Mon, 28 Jul 2014 17:58:26 +0000 (19:58 +0200)]
bdr: fix command tag passing
Petr Jelinek [Mon, 28 Jul 2014 22:14:49 +0000 (00:14 +0200)]
bdr: add command filters for unsupported bdr sequence DDL + tests
Petr Jelinek [Mon, 28 Jul 2014 17:00:24 +0000 (19:00 +0200)]
bdr: bit better query for finding start_value for bdr sequence.
Petr Jelinek [Mon, 28 Jul 2014 15:41:33 +0000 (17:41 +0200)]
bdr: support start param in CREATE SEQUENCE USING bdr. (RT-#37861)
Petr Jelinek [Mon, 28 Jul 2014 12:06:13 +0000 (14:06 +0200)]
bdr: more ddl tests.
* CREATE UNLOGGED TABLE
* CREATE FUNCTION
* CREATE TRIGGER
Petr Jelinek [Mon, 28 Jul 2014 12:04:15 +0000 (14:04 +0200)]
bdr: git ignore vim backup files
Petr Jelinek [Mon, 28 Jul 2014 11:55:17 +0000 (13:55 +0200)]
bdr: Update of dml tests.
* remove the unneeded basic update tests ported from repmgr
* overhaul of generic basic type tests with more types to cover fixed width passbyvalue/passbyreference and variable width types and NULL handling
* rename delete_extended.sql to extended.sql and add UPDATE tests and NULL handling tests there
* add tests for cube and hstore types (to cover fixed width and variable width contrib types)
* add COPY tests to basic types and toast tests
Craig Ringer [Mon, 28 Jul 2014 08:16:36 +0000 (16:16 +0800)]
bdr: By default suppress logging of "safe" conflicts
We were logging update conflicts on both sides, both the upstream and
downstream ends. This produced a complete record but also resulted
in extremely heavy CONFLICT log spam after node bring-up (see #37864).
Andres Freund [Fri, 25 Jul 2014 12:51:34 +0000 (14:51 +0200)]
bdr: Close relation descriptor when replaying a DELETE of a row without pkey
Andres Freund [Fri, 25 Jul 2014 12:47:32 +0000 (14:47 +0200)]
bdr: Don't fail while logging an UPDATE to a nonexistent row
Due to a typo(?) we tried to log the old tuple - which wasn't found -
instead of the tuple we tried to find.
RT-#37811
Petr Jelinek [Fri, 25 Jul 2014 14:41:04 +0000 (16:41 +0200)]
bdr: More DDL tests.
* CREATE (non-bdr) SEQUENCE
* CREATE/DROP INDEX CONCURRENTLY
* One more ADD COLUMN NOT NULL DEFAULT
Christoph Moench-Tegeder [Thu, 24 Jul 2014 15:46:20 +0000 (17:46 +0200)]
bdr: basic UPDATE tests
Andres Freund [Tue, 22 Jul 2014 11:40:36 +0000 (13:40 +0200)]
bdr: Demote log messages about wal messages from LOG to DEBUG1
Christoph Moench-Tegeder [Wed, 23 Jul 2014 14:44:47 +0000 (16:44 +0200)]
bdr: add DELETE tests for complex datatypes
also, cleanup at the end of each test
Andres Freund [Tue, 22 Jul 2014 11:36:57 +0000 (13:36 +0200)]
bdr: Load btree_gist during bdr.so's _PG_init() for easier debugging.
Several people forgot to install btree_gist and had to debug it by
looking in postgresql's log file.
Christoph Moench-Tegeder [Tue, 22 Jul 2014 08:17:08 +0000 (10:17 +0200)]
bdr: remove pg_sleep() from delete_pk test
it has been moved to the init "test"
Christoph Moench-Tegeder [Tue, 22 Jul 2014 07:08:18 +0000 (09:08 +0200)]
bdr: first sequence tests
Andres Freund [Mon, 21 Jul 2014 23:05:10 +0000 (01:05 +0200)]
bdr: Don't print update conflicts using the index's relation descriptor.
For a long while now we've been using the heap's and not the index's
relation descriptor. But apparently do_log_update() didn't get the
memo.
That happened to work fine for all our testcases because the indexes
and the heap definition were sufficiently compatible - but it's far
from guaranteed to work.
Per report from Keaton Adams.
Andres Freund [Mon, 21 Jul 2014 23:04:48 +0000 (01:04 +0200)]
bdr: Make #ifdef VERBOSE_UPDATE code compile again
Alvaro Herrera [Mon, 21 Jul 2014 21:14:50 +0000 (17:14 -0400)]
bdr: don't use a global var after clobbering it
This was reported as #37825 whereby a sequence is not dropped after
creating it for a serial column; but the problem was much more general
than that. Essentially, any time a table was created, the rest of the
command list was ignored because creating the truncate trigger reset
SPI_nprocessed, causing the outer loop to exit early.
Fix by using a local copy of the global variable. Also, add the
originally reported problem as a test case in ddl/create.sql.
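The failure mode and the fix can be shown with a small model. This is only an illustration: `processed` stands in for the global counter named in the message, and `run_query` and `replay_commands` are hypothetical helpers, not BDR or SPI functions.

```python
processed = 0  # stands in for a global result counter


def run_query(rows):
    """Pretend to run a query; overwrites the global counter as a side effect."""
    global processed
    processed = len(rows)
    return rows


def replay_commands(commands):
    """Replay every command, even though nested queries reset the global."""
    run_query(commands)
    count = processed  # local copy taken before anything can clobber it
    done = []
    for i in range(count):
        # A nested query (think: creating the truncate trigger) resets
        # the global to 0 in the middle of the loop.
        run_query([])
        done.append(commands[i])
    return done
```

Had the loop bound been re-read from `processed` on each iteration, the loop would have stopped after the first command, which matches the symptom of the rest of the command list being ignored.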
Christoph Moench-Tegeder [Mon, 21 Jul 2014 17:33:58 +0000 (19:33 +0200)]
bdr: testcase for #37826
Christoph Moench-Tegeder [Mon, 21 Jul 2014 08:05:20 +0000 (10:05 +0200)]
bdr: basic dml test: CREATE TABLE, INSERT, DELETE, DROP TABLE
Andres Freund [Mon, 21 Jul 2014 18:10:33 +0000 (20:10 +0200)]
bdr: Fix typo sometimes causing replication of UPDATE/DELETE to fail
The BdrTupleData's isnull array was accessed off-by-one which caused
bogus scankeys to be generated if the next column was null but the
current column shouldn't have been.
Andres Freund [Mon, 21 Jul 2014 17:55:00 +0000 (19:55 +0200)]
bdr: Move test initialization into its own .sql file.
That way we can get rid of redundant pg_sleep()s but still run
individual tests alone (by running init and then the individual test).
Andres Freund [Fri, 18 Jul 2014 14:43:03 +0000 (16:43 +0200)]
bdr: Don't accidentally forward replication progress for InvalidRepNodeId
Since "bdr: Send remote transaction origin (sysid,tlid,dboid) at
BEGIN" remote transactions without an explicitly specified origin
accidentally also have advanced replication identifier progress for
the node '0' (which actually means invalid).
Add check to prevent that.
Andres Freund [Fri, 18 Jul 2014 10:19:33 +0000 (12:19 +0200)]
bdr: Add regression tests for CREATE/DROP extension replication.
Craig Ringer [Fri, 18 Jul 2014 04:32:41 +0000 (12:32 +0800)]
bdr: Use --with-extra-version in bdr quickstart script
Craig Ringer [Tue, 15 Jul 2014 01:15:58 +0000 (09:15 +0800)]
bdr: Fix non-vpath "make check"
in-tree "make check" was failing because:
ln -fs $(top_srcdir)/contrib/bdr/pg_hba.conf .
fails with:
ln: ‘../../contrib/bdr/pg_hba.conf’ and ‘./pg_hba.conf’ are the same file
Test for the existence of the file and link only if missing instead.
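The "link only if missing" check can be sketched as below. The actual fix lives in the Makefile; this Python version just illustrates the logic, and the helper name is hypothetical.

```python
import os


def link_if_missing(source, link_name):
    """Create a symlink at link_name unless something already exists there."""
    # lexists() is also true for a dangling symlink, so we never overwrite one.
    if os.path.lexists(link_name):
        return False
    os.symlink(source, link_name)
    return True
```

In an in-tree build the source and the link target are the same path, so the existence check fails fast and no link is attempted.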
Craig Ringer [Tue, 15 Jul 2014 01:12:07 +0000 (09:12 +0800)]
bdr: ignore generated bdr_version.h
Andres Freund [Mon, 14 Jul 2014 06:52:27 +0000 (08:52 +0200)]
bdr: Add minimal tests for updates of toasted columns.
Andres Freund [Mon, 14 Jul 2014 06:50:40 +0000 (08:50 +0200)]
bdr: Don't uselessly form heap tuples to just deform them again.
Every caller of read_tuple() iterated over the formed HeapTuple's
column shortly afterwards. Just use read_tuple_parts() and only build
a full HeapTuple when necessary.
Andres Freund [Mon, 14 Jul 2014 05:10:51 +0000 (07:10 +0200)]
bdr: Initialize unchanged columns in read_tuple_parts() as NULL.
The previous behaviour was arguably ok and led to easier-to-notice
bugs, but using it would have required changing every caller. As the
code stands there are several callers of read_tuple_parts() that just
care about columns that are guaranteed to be included.
RT-#37805
Craig Ringer [Mon, 14 Jul 2014 02:59:38 +0000 (10:59 +0800)]
bdr: Validate connection names to reject names containing _
Craig Ringer [Mon, 14 Jul 2014 02:49:13 +0000 (10:49 +0800)]
bdr: There's no bdr.bdr_connections, it's bdr.connections
Andres Freund [Fri, 4 Jul 2014 16:44:28 +0000 (18:44 +0200)]
bdr: Significantly improve wal flushing communication.
The feedback behaviour of apply processes with senders was far from
optimal: When bdr.synchronous_commit was set to off we reported LSNs
as flushed that weren't. And without bdr.synchronous_commit=off replay
was slow unless fsync()s are fast. That's because there's no
parallelism during apply.
Improve reporting by keeping an in-memory list associating remote and
local LSNs so we can check up to where we've already flushed.
Also only accept a boolean for bdr.synchronous_commit=on/off.
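A rough sketch of the remote/local LSN association described above. It assumes LSNs are plain integers and that entries are recorded in apply order; the class and method names are hypothetical and do not come from the BDR source.

```python
from collections import deque


class FlushTracker:
    """Associate remote commit LSNs with the local LSN they were written at."""

    def __init__(self):
        self._pending = deque()  # (remote_lsn, local_lsn), in apply order
        self.remote_flushed = 0  # highest remote LSN known to be durable locally

    def record(self, remote_lsn, local_lsn):
        """Remember that remote_lsn was applied and ends at local_lsn."""
        self._pending.append((remote_lsn, local_lsn))

    def advance(self, local_flush_lsn):
        """Drop entries already flushed locally; return the remote LSN to report."""
        while self._pending and self._pending[0][1] <= local_flush_lsn:
            self.remote_flushed = self._pending.popleft()[0]
        return self.remote_flushed
```

The point of the list is that feedback to the sender only ever reports remote positions whose local writes have actually reached disk, regardless of the synchronous_commit setting.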
Andres Freund [Fri, 4 Jul 2014 12:01:25 +0000 (14:01 +0200)]
bdr: Improve setup for make check
Vpath builds were missing pg_hba.conf and the results/ddl directory
wasn't created during the build.
Andres Freund [Fri, 4 Jul 2014 11:56:48 +0000 (13:56 +0200)]
bdr: Determine quorum for ddl locks and sequencer votes somewhat reasonably.
Previously we were using the overall number of connections to remote
databases to determine the number of nodes. Unsurprisingly that falls
short if there's more than one database configured for replication.
RT-#37781
RT-#37765
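To illustrate why the connection count misleads with several replicated databases, here is a small sketch. Everything in it is an assumption made for illustration: the connection records, the `local_db` key, counting the local node as one extra participant, and the simple-majority quorum rule are not taken from the commit.

```python
def node_count(connections, dbname):
    """Count nodes for one database: its remote connections plus the local node."""
    remotes = [c for c in connections if c["local_db"] == dbname]
    return len(remotes) + 1


def quorum(nodes):
    """Simple majority; an assumed rule, not one stated in the commit."""
    return nodes // 2 + 1
```

With two connections for database `a` and one for database `b`, counting all three connections for either database would overstate its node count and therefore its quorum.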
Petr Jelinek [Fri, 4 Jul 2014 11:53:37 +0000 (13:53 +0200)]
bdr: bdr_init_copy: fix connection parsing and memory corruption bugs
RT-#37784, reported by Craig
Petr Jelinek [Fri, 4 Jul 2014 10:21:12 +0000 (12:21 +0200)]
bdr: remove unused bdr_get_connection_config function
Christoph Moench-Tegeder [Fri, 4 Jul 2014 06:05:55 +0000 (08:05 +0200)]
bdr: use "more flexible" shebang for shell script
bash may be installed in locations other than /bin (think /usr/local/bin/),
and using a generic /bin/sh would not work with the current bdr_initial_load.
Andres Freund [Thu, 3 Jul 2014 23:39:52 +0000 (01:39 +0200)]
bdr: Fix silly mistakes in the merge of worker and output plugins
Andres Freund [Thu, 3 Jul 2014 16:00:27 +0000 (18:00 +0200)]
bdr: move to version 0.6
Andres Freund [Thu, 3 Jul 2014 11:59:56 +0000 (13:59 +0200)]
bdr: Merge worker and output plugin shared objects.
The output plugin API used to require a separate shared object for
output plugins when _PG_init() did significant stuff. That's not the
case anymore, so get rid of that split.
Andres Freund [Thu, 3 Jul 2014 11:18:30 +0000 (13:18 +0200)]
bdr: Improve sequence debug messages a bit.
Andres Freund [Thu, 3 Jul 2014 11:04:14 +0000 (13:04 +0200)]
bdr: Improve handling of non-PANIC crashes of the perdb worker.
The ddl locking subsystem sent out a message to release all ddl locks
when the perdb worker was restarted. That can, among others, happen if
a DDL lock clashes with the sequencer.
RT-#37787, reported by Petr Jelinek
Andres Freund [Thu, 3 Jul 2014 09:58:55 +0000 (11:58 +0200)]
bdr: Don't error out when replicating rows with NULLs in unique keys.
At an earlier point build_index_scan_key() never had to deal with
indexes where keys can contain NULL. Instead of erroring out, properly
set SK_ISNULL and return the information that the scankey returns
NULLs to the caller.
Current callers can just skip indexes with NULLs.
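The caller-side behaviour can be modelled as follows. This is a simplified sketch, not the C function: rows are dicts, `None` stands in for SQL NULL, and `usable_indexes` is a hypothetical caller.

```python
def build_scan_key(index_columns, row):
    """Build (column, value, is_null) entries and report whether any key is NULL."""
    key = []
    has_nulls = False
    for col in index_columns:
        value = row.get(col)
        is_null = value is None  # plays the role of flagging the key as NULL
        has_nulls = has_nulls or is_null
        key.append((col, value, is_null))
    return key, has_nulls


def usable_indexes(indexes, row):
    """Return names of indexes whose key columns are all non-NULL for this row."""
    return [name for name, cols in indexes.items()
            if not build_scan_key(cols, row)[1]]
```

Skipping such indexes is safe for row lookup because a unique index does not treat two NULL keys as equal, so a key containing NULL cannot identify a single row.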
Craig Ringer [Wed, 2 Jul 2014 05:57:26 +0000 (13:57 +0800)]
bdr: Small usage tip for bdr_init_copy