From cac34e5281b46c2c0c04de5d671412f55b70c26b Mon Sep 17 00:00:00 2001
From: Steve Singer
Date: Thu, 4 Jul 2013 16:40:50 -0400
Subject: [PATCH] documentation updates for log shipping

Make the slony_logshipper daemon have a proper page in the reference
section, like slon and slonik.  Clean up the rest of the log shipping
instructions so that they read as a set of instructions rather than a
question/answer format.
---
 doc/adminguide/filelist.sgml        |   2 +
 doc/adminguide/logshipping.sgml     | 356 +++++-----------------------
 doc/adminguide/logshipping_ref.sgml |  87 +++++++
 doc/adminguide/slony.sgml           |   2 +
 4 files changed, 153 insertions(+), 294 deletions(-)
 create mode 100644 doc/adminguide/logshipping_ref.sgml

diff --git a/doc/adminguide/filelist.sgml b/doc/adminguide/filelist.sgml
index 29ef6d28..f073b2f4 100644
--- a/doc/adminguide/filelist.sgml
+++ b/doc/adminguide/filelist.sgml
@@ -26,6 +26,7 @@
+
@@ -34,6 +35,7 @@
+
+

diff --git a/doc/adminguide/logshipping.sgml b/doc/adminguide/logshipping.sgml
index 4a53a869..efe42f89 100644
--- a/doc/adminguide/logshipping.sgml
+++ b/doc/adminguide/logshipping.sgml
@@ -6,8 +6,8 @@

 &slony1; has the ability to serialize the updates to go out into log files
 that can be kept in a spool directory.

- The spool files could then be transferred via whatever means was desired
- to a slave system, whether that be via FTP,
+ The spool files can then be transferred via whatever means is desired
+ to a slave system, whether that be via FTP,
 rsync, or perhaps even by pushing them onto a 1GB USB key to be sent to
 the destination by clipping it to the ankle of some sort of avian
 transport system.

@@ -22,11 +22,7 @@ in this form, including:

 Replicating to destinations where it is not possible to set up
 bidirectional communications

- Supporting a different form of PITR (Point In Time Recovery) that
- filters out read-only transactions and updates to tables that are not
- of interest.

 If some disaster strikes, you can look at the logs of queries in detail

@@ -34,16 +30,12 @@ in this form, including:

 might not intend to actually create a log-shipped node.

- This is a really slick scheme for building load for doing tests
- We have a data escrow system that would become incredibly cheaper
- given log shipping
+ The .SQL log files could be used to generate write activity for
+ doing tests

 You may apply triggers on the disconnected
- node to do additional processing on the data
- For instance, you might take a fairly stateful
+ node to do additional processing on the data.  For instance,
+ you might take a fairly stateful
 database and turn it into a temporal one by use of triggers that
 implement the techniques described in Developing Time-Oriented Database
 Applications in SQL

@@ -52,304 +44,80 @@ in this form, including:

- Where are the spool files for a subscription set generated?
- Any slon subscriber node can generate them by adding the option.
- Notice that this implies that in order to use log shipping, you must
- have at least one subscriber node.

- What takes place when a failover / MOVE SET takes place?
- Nothing special.  So long as the archiving node remains a subscriber,
- it will continue to generate logs.
- If the archiving node becomes the origin, on the other hand, it will
- continue to generate logs.

- What if we run out of spool space?
- The node will stop accepting SYNCs until this problem is alleviated.
- The database being subscribed to will also fall behind.

- How do we set up a subscription?
- The script in tools called slony1_dump.sh is a shell script that dumps
- the present state of the subscriber node.
- You need to start the slon for the subscriber node with logging turned
- on.  At any point after that, you can run slony1_dump.sh, which will
- pull the state of that subscriber as of some SYNC event.  Once the dump
- completes, all the SYNC logs generated from the time that dump started
- may be added to the dump in order to get a log shipping subscriber.

- What are the limitations of log shipping?

- In the initial release, there are rather a lot of limitations.  As
- releases progress, hopefully some of these limitations may be
- alleviated/eliminated.

- The log shipping functionality amounts to sniffing the data applied at
- a particular subscriber node.  As a result, you must have at least one
- regular node; you cannot have a cluster that consists solely of an
- origin and a set of log shipping nodes.

- The log shipping node tracks the entirety of the traffic going to a
- subscriber.  You cannot separate things out if there are multiple
- replication sets.

- The log shipping node presently only fully tracks SYNC events.  This
- should be sufficient to cope with some changes in cluster
- configuration, but not others.

+Setting up Log Shipping

- A number of event types are handled in such a way that log shipping
- copes with them:

- SYNC events are, of course, handled.
- DDL_SCRIPT is handled.
- UNSUBSCRIBE_SET
- This event, much like SUBSCRIBE_SET, is not handled by the log shipping
- code.  But its effect is, namely that SYNC events on the subscriber
- node will no longer contain updates to the set.
- Similarly, SET_DROP_TABLE, SET_DROP_SEQUENCE, SET_MOVE_TABLE,
- SET_MOVE_SEQUENCE, DROP_SET, MERGE_SET, SUBSCRIBE_SET will be handled
- appropriately.

- The various events involved in node configuration are irrelevant to log
- shipping: STORE_NODE, ENABLE_NODE, DROP_NODE, STORE_PATH, DROP_PATH,
- STORE_LISTEN, DROP_LISTEN

- Events involved in describing how particular sets are to be initially
- configured are similarly irrelevant: STORE_SET, SET_ADD_TABLE,
- SET_ADD_SEQUENCE, STORE_TRIGGER, DROP_TRIGGER

- It would be nice to be able to turn a log shipped node into a fully
- communicating &slony1; node that you could failover to.  This would be
- quite useful if you were trying to construct a cluster of (say) 6
- nodes; you could start by creating one subscriber, and then use log
- shipping to populate the other 4 in parallel.
- This usage is not supported, but presumably one could take an
- application outage and promote the log-shipping node to a normal slony
- node with the OMIT COPY option of SUBSCRIBE SET.

+Setting up log shipping requires that you already have a replication
+cluster set up with at least two nodes, and that the tables and sequences
+that you want to ship data for are part of replication sets that the
+subscriber is subscribed to.

- Usage Hints

- Here are some more-or-less disorganized notes about how you might want
- to use log shipping...

- You don't want to blindly apply SYNC files because any given SYNC file
- may not be the right one.  If it's wrong, then the result will be that
- the call to setsyncTracking_offline() will fail, and your psql session
- will ABORT, and then run through the remainder of that SYNC file
- looking for a COMMIT or ROLLBACK so that it can try to move on to the
- next transaction.
- But we know that the entire remainder of the file will fail!
- It is futile to go through the parsing effort of reading the remainder
- of the file.

- Better idea:

- The table, on the log shipped node, tracks which log it most recently
- applied in table sl_archive_tracking.
- Thus, you may predict the ID number of the next file by taking the
- latest counter from this table and adding 1.

- There is still variation as to the filename, depending on what the
- overall set of nodes in the cluster are.  All nodes periodically
- generate SYNC events, even if they are not an origin node, and the log
- shipping system does generate logs for such events.

- As a result, when searching for the next file, it is necessary to
- search for files in a manner similar to the following:

-ARCHIVEDIR=/var/spool/slony/archivelogs/node4
-SLONYCLUSTER=mycluster
-PGDATABASE=logshipdb
-PGHOST=logshiphost
-NEXTQUERY="select at_counter+1 from \"_${SLONYCLUSTER}\".sl_archive_tracking;"
-nextseq=`psql -d ${PGDATABASE} -h ${PGHOST} -A -t -c "${NEXTQUERY}"`
-filespec=`printf "slony1_log_*_%20d.sql" ${nextseq}`
-for file in `find $ARCHIVEDIR -name "${filespec}"`; do
-    psql -d ${PGDATABASE} -h ${PGHOST} -f ${file}
-done

+1. Create your log shipping target database, including your application's
+   tables.  This can be created from your application's DDL sources or
+   with a pg_dump -s from a slave node.  If you are using a pg_dump from
+   a slave, then you need to exclude the slony log triggers from this
+   dump, and possibly some of your application triggers.  The script
+   tools/find-triggers-to-deactivate.sh might help with this.

+2. Stop the slon daemon for the slave node that you want to generate the
+   log shipping files from.  Log shipping files are generated by a slon
+   daemon for a slave, and include the changes that are applied to that
+   slave for all sets that the slave is subscribed to.

+3. Run the script tools/slony1_dump.sh.  This shell script dumps the
+   present state (data) of the subscriber node.  The script requires
+   psql to be in your PATH, and the PGHOST, PGDATABASE and PGPORT
+   variables to be set (if required) so that psql can connect to the
+   database you are dumping from.  The name of your slony cluster should
+   be passed as an argument to slony1_dump.sh.

+4. The dump generated by slony1_dump.sh on standard output should then be
+   restored (via psql) on the log shipping target node.  This will also
+   create a set of slony log shipping tracking tables that the log
+   shipping daemon will use to track which events have been processed on
+   the log shipping target node.

+5. Restart the slon daemon on the slave node, passing the option that
+   specifies the directory the log shipping files should be placed in.
+   Alternatively, you can specify the equivalent option in your slon.conf
+   file.

+Slon will generate a file in the archive directory for each SYNC event
+that it processes.  These files will contain COPY statements that insert
+data into the sl_log_archive table.  A trigger that was installed by
+slony1_dump.sh will intercept these inserts and make the appropriate
+changes to your replicated tables.
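Strung together, the steps above look roughly like the following.  This is
only a sketch, not a canned solution: the cluster name, host names,
database names and archive directory are placeholders, and it assumes
slon's -a archive-directory option as described on the slon reference
page; adjust all of it to your environment.

#!/bin/sh
# Sketch only -- cluster, hosts, databases and paths are placeholders.
CLUSTER=mycluster
ARCHIVEDIR=/var/spool/slony/archive

# Step 3: dump the present state of the (stopped) subscriber node.
# slony1_dump.sh connects using PGHOST/PGDATABASE/PGPORT and takes the
# cluster name as its argument, writing the dump to standard output.
PGHOST=subscriberhost PGDATABASE=subscriberdb \
  ./tools/slony1_dump.sh "$CLUSTER" > logship_base.sql

# Step 4: restore the dump on the log shipping target; this also creates
# the sl_log_archive table, its trigger, and the tracking tables.
psql -h targethost -d targetdb -f logship_base.sql

# Step 5: restart the subscriber's slon with an archive directory so that
# it writes one .SQL file per SYNC into $ARCHIVEDIR (shown here with -a;
# the equivalent slon.conf setting may be used instead).
slon -a "$ARCHIVEDIR" "$CLUSTER" "host=subscriberhost dbname=subscriberdb" &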
- find-triggers-to-deactivate.sh

- log shipping: trigger deactivation

- It was once pointed out (Bugzilla bug #19) that the dump of a schema
- may include triggers and rules that you may not wish to have running on
- the log shipped node.

- The tool tools/find-triggers-to-deactivate.sh was created to assist
- with this task.  It may be run against the node that is to be used as a
- schema source, and it will list the rules and triggers present on that
- node that may, in turn, need to be deactivated.

- It includes logtrigger and denyaccess triggers, which may be left out
- of the extracted schema, but it is still worth the Gentle Administrator
- verifying that such triggers are kept out of the log shipped replica.

+Applying Log Files

+The .SQL files that slon places in the archive directory contain SQL
+commands that should be executed on the log shipping target.  These files
+need to be applied in the proper order.  The file name of the next SQL
+file to apply is based on information contained in the sl_archive_tracking
+table.  &lslonylogshipping; is a daemon that will monitor an archive
+directory and apply the updates in the proper order.

+Each .SQL file contains a SQL COPY command that will copy the data into
+the sl_log_archive table, where a trigger will instead perform the proper
+action on the target database.  The slony1_dump.sh script will create the
+sl_log_archive table and set up the trigger.

- slony_logshipper Tool

- log shipping: slony_logshipper tool

- As of version 1.2.12, &slony1; has a tool designed to help apply logs,
- called slony_logshipper.  It is run with three sorts of parameters:

+Converting SQL commands from COPY to INSERT/UPDATE/DELETE

- Options, chosen from the following:
-   display this help text and exit
-   display program version and exit
-   quiet mode
-   cause running daemon to reopen its logfile
-   cause running daemon to resume after error
-   cause running daemon to enter smart shutdown mode
-   cause running daemon to enter immediate shutdown mode
-   destroy existing semaphore set and message queue (use with caution)
-   stay in foreground (don't daemonize)
-   enter smart shutdown mode immediately

- A specified log shipper configuration file.  This configuration file
- consists of the following specifications:
-   logfile = './offline_logs/logshipper.log';
-     Where the log shipper will leave messages.
-   cluster name = 'T1';
-     Cluster name.
-   destination database = 'dbname=slony_test3';
-     Optional conninfo for the destination database.  If given, the log
-     shipper will connect to this database, and apply logs to it.
-   archive dir = './offline_logs';
-     The archive directory is required when running in database-connected
-     mode to have a place to scan for missing (unapplied) archives.
-   destination dir = './offline_result';
-     If specified, the log shipper will write the results of data
-     massaging into result logfiles in this directory.
-   max archives = 3600;
-     This fights eventual resource leakage; the daemon will enter smart
-     shutdown mode automatically after processing this many archives.
-   ignore table "public"."history";
-     One may filter out single tables from log shipped replication.
-   ignore namespace "public";
-     One may filter out entire namespaces from log shipped replication.
-   rename namespace "public"."history" to "site_001"."history";
-     One may rename specific tables.
-   rename namespace "public" to "site_001";
-     One may rename entire namespaces.
-   post processing command = 'gzip -9 $inarchive';
-     Pre- and post-processing commands are executed via system(3).
-     An @ as the first character causes the exit code to be ignored.
-     Otherwise, a nonzero exit code is treated as an error and causes
-     processing to abort.

- Pre- and post-processing commands have two further special variables
- defined:
-   $inarchive   - indicating incoming archive filename
-   $outnarchive - indicating outgoing archive filename

- error command = ' ( echo "archive=$inarchive"
-                     echo "error messages:"
-                     echo "$errortext" ) | mail -s "Slony log shipping failed" postgres@localhost ';
- The error command indicates a command to execute upon encountering an
- error.  All logging since the last successful completion of an archive
- is available in the $errortext variable.
- In the example shown, this sends an email to the DBAs upon encountering
- an error.
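Putting several of the directives above together, a slony_logshipper
configuration file might look like the following sketch.  The values are
simply the example values already listed in this section (log file,
cluster name, conninfo, directories, and one filtering and one renaming
rule); they would be replaced with your own settings.

logfile = './offline_logs/logshipper.log';
cluster name = 'T1';
destination database = 'dbname=slony_test3';
archive dir = './offline_logs';
destination dir = './offline_result';
max archives = 3600;
ignore table "public"."history";
rename namespace "public" to "site_001";
post processing command = 'gzip -9 $inarchive';

The daemon is then invoked with its options, this configuration file, and
any archive file names to be queued, per the three kinds of parameters
described above.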
+Prior to &slony1; 2.2, the SQL files generated for log shipping contained
+INSERT/UPDATE/DELETE statements.  As of &slony1; 2.2, the log shipping
+files contain COPY statements, which should result in better performance.
+If you need the old style of SQL files with INSERT/UPDATE/DELETE
+statements, then the script tools/logshipping_toinsert.pl can be used to
+convert the COPY-style log shipping files to INSERT/UPDATE/DELETE
+statements.  INSERT/UPDATE/DELETE statements should be easier to apply
+against databases other than PostgreSQL, or in environments where you
+can't create the sl_log_archive table.

- Archive File Names

- Each filename is added to the SystemV Message queue for processing by a
- slony_logshipper process.