repl_procedures.tar.gz contains a set of replication example scripts. Each script contains a combination of GT.M commands that accomplish a specific task. All examples in the Procedures section use these replication scripts but each example uses different script sequence and script arguments. Always run all replication examples in a test system from a new directory as they create sub-directories and database files in the current directory. No claim of copyright is made with regard to these examples. These example scripts are for explanatory purposes and are not intended for production use. YOU MUST UNDERSTAND, AND APPROPRIATELY ADJUST THE COMMANDS GIVEN IN THESE SCRIPTS BEFORE USING IN A PRODUCTION ENVIRONMENT. Typically, you would set replication between instances on different systems/data centers and create your own set of scripts with appropriate debugging and error handling to manage replication between them. Click Download repl_procedures.tar.gz to download repl_procedures.tar.gz on a test system.

repl_procedures.tar.gz includes the following scripts:

On A:

[Note]Note

Note that while a backed up instance file helps start replication on the P side of A→P, it does not prevent the need for taking a backup of the database on A. You need to do a database backup/restore or an extract/load from A to P to ensure P has all of the data as on A at startup.

On P:

Example:

source ./gtmenv A V6.3-000A_x86_64
./db_create
./repl_setup
./originating_start A P 4000
./backup_repl startA
source ./gtmenv P V6.3-000A_x86_64
./db_create
./suppl_setup P startA 4000 -updok
./repl_status
# For subsequent Receiver Server startup for P, use:
# ./replicating_start_suppl_n P 4000 -updok -autorollback
# or 
#./rollback 4000 backward
#./replicating_start_suppl_n P 4000 -updok

The shutdown sequence is as follows:

source ./gtmenv P V6.3-000A_x86_64
./replicating_stop
source ./gtmenv A V6.3-000A_x86_64
./originating_stop

The more common scenario for bringing up a replicating instance is to take a backup of the originating instance and bring it up as a replicating instance. If the backup is a comprehensive backup, the file headers store the journal sequence numbers.

On A:

On the backed up instance:

Example:

The following example demonstrates starting a replicating instance from the backup of an originating instance in an A→B replication configuration. Note that you do not need to perform an -updateresync when starting a BC replicating instance for the first time.

source ./gtmenv A V6.3-000A_x86_64
./db_create
./repl_setup
./originating_start A backupA 4001
./backup_repl startingA   #Preserve the backup of the replicating instance file that represents the state at the time of starting the instance. 
$gtm_dist/mumps -r %XCMD 'for i=1:1:10 set ^A(i)=i'
mkdir backupA   
$gtm_dist/mupip backup -replinst=currentstateA -newjnlfile=prevlink -bkupdbjnl=disable DEFAULT backupA
source ./gtmenv backupA V6.3-000A_x86_64
./db_create
./repl_setup
cp currentstateA backupA/gtm.repl
$gtm_dist/mupip replicate -editinstance -name=backupA backupA/gtm.repl 
./replicating_start backupA 4001 
./repl_status

The shutdown sequence is as follows:

source ./gtmenv backupA V6.3-000A_x86_64
./replicating_stop
source ./gtmenv A V6.3-000A_x86_64
./originating_stop

The following example demonstrates starting a replicating instance from the backup of an originating instance in an A→P replication configuration. Note that you need to perform an -updateresync to start a supplementary instance for the first time.

source ./gtmenv A V6.3-000A_x86_64
./db_create
./repl_setup
./originating_start A backupA 4011
./backup_repl startingA   
$gtm_dist/mumps -r %XCMD 'for i=1:1:10 set ^A(i)=i'
./backup_repl currentstateA
mkdir backupA
$gtm_dist/mupip backup -newjnlfile=prevlink -bkupdbjnl=disable DEFAULT backupA
source ./gtmenv backupA V6.3-000A_x86_64
./db_create
./suppl_setup backupA currentstateA 4011 -updok
./repl_status

The shutdown sequence is as follows:

source ./gtmenv backupA V6.3-000A_x86_64
./replicating_stop
source ./gtmenv A V6.3-000A_x86_64
./originating_stop

A switchover is the procedure of switching the roles of an originating instance and a replicating instance. A switchover is necessary for various reasons including (but not limited to) testing the replicating instance preparedness to take over the role of an originating instance or bringing the originating instance down for maintenance in a way that there is minimal impact on application availability.

In an A->B replication configuration, at any given point there can be two possibilities:

The steps described in this section perform a switchover (A->B becomes B->A) under both these possibilities. When A is ahead of B, these steps generate a lost transaction file which must be applied to the new originating instance as soon as possible. The lost transaction file contains transactions which are were not replicated to B. Apply the lost transactions on the new originating instance either manually or in a semi-automated fashion using the M-intrinsic function $ZQGBLMOD(). If you use $ZQGBLMOD(), perform two additional steps (mupip replicate -source -needrestart and mupip replicate -source -losttncomplete) as part of lost transaction processing. Failure to run these steps can cause $ZQGBLMOD() to return false negatives that in turn can result in application data consistency issues.

First, choose a time when there are no database updates or the rate of updates are low to minimize the chances that your application may time out. There may be a need to hold database updates briefly during the switchover. For more information on holding database updates, refer to Instance Freeze section to configure an appropriate freezing mechanism suitable for your environment.

In an A→B replication configuration, follow these steps:

On A:

On B:

On A:

The following example runs a switchover in an A→B replication configuration.

source ./gtmenv A V6.3-000A_x86_64 # creates a simple environment for instance A
./db_create
./repl_setup # enables replication and creates the replication instance file
./originating_start A B 4001 # starts the active Source Server (A->B)
$gtm_dist/mumps -r %XCMD 'for i=1:1:100 set ^A(i)=i'
./repl_status #-SHOWBACKLOG and -CHECKHEALTH report
source ./gtmenv B V6.3-000A_x86_64 # creates a simple environment for instance B
./db_create
./repl_setup
./replicating_start B 4001 
./repl_status # -SHOWBACKLOG and -CHECKHEATH report 
./replicating_stop # Shutdown the Receiver Server and the Update Process 
source ./gtmenv A V6.3-000A_x86_64 # Creates an environment for A
$gtm_dist/mumps -r %XCMD 'for i=1:1:50 set ^losttrans(i)=i' # perform some updates when replicating instance is not available. 
sleep 2
./originating_stop # Stops the active Source Server 
source ./gtmenv B V6.3-000A_x86_64 # Create an environment for B
./originating_start B A 4001 # Start the active Source Server (B->A)
source ./gtmenv A V6.3-000A_x86_64 # Create an environment for A
./rollback 4001 backward
./replicating_start A 4001 # Start the replication Source Server 
./repl_status # To confirm whether the Receiver Server and the Update Process started correctly.
cat A/gtm.lost 

The shutdown sequence is as follows:

source ./gtmenv A V6.3-000A_x86_64
./replicating_stop
source ./gtmenv B V6.3-000A_x86_64
./originating_stop

The following scenario demonstrates a switchover from B←A→P to A←B→P when A has unreplicated updates that require rollback before B can become the new originating instance.

A

B

P

Comments

O: ... A95, A96, A97, A98, A99

R: ... A95, A96, A97, A98

S: ... P34, A95, P35, P36, A96, A97, P37, P38

A as an originating primary instance at transaction number A99, replicates to B as a BC replicating secondary instance at transaction number A98 and P as a SI that includes transaction number A97, interspersed with locally generated updates. Updates are recorded in each instance's journal files using before-image journaling.

Crashes

O: ... A95, A96, A97, A98, B61

... P34, A95, P35, P36, A96, A97, P37, P38

When an event disables A, B becomes the new originating primary, with A98 as the latest transaction in its database, and starts processing application logic to maintain business continuity. In this case where P is not ahead of B, the Receiver Server at P can remain up after A crashes. When B connects, its Source Server and P"s Receiver Server confirms that B is not behind P with respect to updates received from A, and SI replication from B picks up where replication from A left off.

-

O: ... A95, A96, A97, A98, B61, B62

S: ... P34, A95, P35, P36, A96, A97, P37, P38, A98, P39, B61, P40

P operating as a supplementary instance to B replicates transactions processed on B, and also applies its own locally generated updates. Although A98 was originally generated on A, P received it from B because A97 was the common point between B and P.

... A95, A96, A97, A98, A99

O: ... A95, A96, A97, A98, B61, B62, B63, B64

S: ... P34, A95, P35, P36, A96, A97, P37, P38, A98, P39, B61, P40, B62, B63

P, continuing as a supplementary instance to B, replicates transactions processed on B, and also applies its own locally generated updates. A meanwhile has been repaired and brought online. It has to roll transaction A99 off its database into an Unreplicated Transaction Log before it can start operating as a replicating secondary instance to B.

R: ... A95, A96, A97, A98, B61, B62, B63, B64

O: ... A95, A96, A97, A98, B61, B62, B63, B64, B65

S: ... P34, A95, P35, P36, A96, A97, P37, P38, A98, P39, B61, P40, B62, B63, P41, B64

Having rolled off transactions into an Unreplicated Transaction Log, A can now operate as a replicating secondary instance to B. This is normal BC Logical Multi-Site operation. B and P continue operating as originating primary instance and supplementary instance.

The following example creates this switchover scenario:

source ./gtmenv A V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'set ^A(98)=99'
source ./gtmenv B V6.3-000A_x86_64
./replicating_stop
source ./gtmenv A V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'set ^A(99)=100'
./originating_stop
source ./gtmenv B V6.3-000A_x86_64
./originating_start B A 4010
./originating_start B P 4011
./backup_repl startB
$gtm_dist/mumps -r ^%XCMD 'set ^B(61)=0'
source ./gtmenv P V6.3-000A_x86_64
./suppl_setup M startB 4011 -updok
$gtm_dist/mumps -r ^%XCMD 'for i=39:1:40 set ^P(i)=i'
source ./gtmenv B V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'set ^B(62)=1,^B(63)=1'
source ./gtmenv A V6.3-000A_x86_64
./rollback 4010 backward
./replicating_start A 4010
source ./gtmenv B V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'set ^B(64)=1,^B(65)=1'
cat A/gtm.lost

The shutdown sequence is as follows:

source ./gtmenv B V6.3-000A_x86_64
./originating_stop
source ./gtmenv A V6.3-000A_x86_64 
./replicating_stop
source ./gtmenv P V6.3-000A_x86_64
./replicating_stop

The following demonstrates a switchover scenario from B←A→P to A←B→P where A and P have unreplicated updates that require rollback before B can become the new originating instance.

A

B

P

Comments

O: ... A95, A96, A97, A98, A99

R: ... A95, A96, A97

S: ... P34, A95, P35, P36, A96, A97, P37, P38, A98, P39, P40

A as an originating primary instance at transaction number A99, replicates to B as a BC replicating secondary instance at transaction number A97 and P as a SI that includes transaction number A98, interspersed with locally generated updates. Updates are recorded in each instance's journal files using before-image journaling.

Crashes

O: ... A95, A96, A97

... P34, A95, P35, P36, A96, A97, P37, P38, A98, P39, P40

When an event disables A, B becomes the new originating primary, with A97 the latest transaction in its database. P cannot immediately start replicating from B because the database states would not be consistent - while B does not have A98 in its database and its next update may implicitly or explicitly depend on that absence, P does, and may have relied on A98 to compute P39 and P40.

-

O: ... A95, A96, A97, B61, B62

S: ... P34, A95, P35, P36, A96, A97, P37, P38, B61

For P to accept replication from B, it must roll off transactions generated by A, (in this case A98) that B does not have in its database, as well as any additional transactions generated and applied locally since transaction number A98 from A.[a] This rollback is accomplished with a MUPIP JOURNAL -ROLLBACK -FETCHRESYNC operation on P.[b] These rolled off transactions (A98, P39, P40) go into the Unreplicated Transaction Log and can be subsequently reprocessed by application code.[c] Once the rollback is completed, P can start accepting replication from B.[d] B in its Originating Primary role processes transactions and provides business continuity, resulting in transactions B61 and B62.

-

O: ... A95, A96, A97, B61, B62, B63, B64

S: ... P34, A95, P35, P36, A96, A97, P37, P38, B61, B62, P39a, P40a, B63

P operating as a supplementary instance to B replicates transactions processed on B, and also applies its own locally generated updates. Note that P39a & P40a may or may not be the same updates as the P39 & P40 previously rolled off the database.

[a] As this rollback is more complex, may involve more data than the regular LMS rollback, and may involve reading journal records sequentially; it may take longer.

[b] In scripting for automating operations, there is no need to explicitly test whether B is behind P - if it is behind, the Source Server will fail to connect and report an error, which automated shell scripting can detect and effect a rollback on P followed by a reconnection attempt by B. On the other hand, there is no harm in P routinely performing a rollback before having B connect - if it is not ahead, the rollback will be a no-op. This characteristic of replication is unchanged from releases prior to V5.5-000.

[c] GT.M's responsibility for them ends once it places them in the Unreplicated Transaction Log.

[d] Ultimately, business logic must determine whether the rolled off transactions can simply be reapplied or whether other reprocessing is required. GT.M's $ZQGBLMOD() function can assist application code in determining whether conflicting updates may have occurred.

The following example creates this scenario.

source ./gtmenv A V6.3-000A_x86_64
./db_create
./repl_setup
./originating_start A B 4010
./originating_start A P 4011
./backup_repl startA
$gtm_dist/mumps -r ^%XCMD 'for i=1:1:97 set ^A(i)=i'
source ./gtmenv B V6.3-000A_x86_64
./db_create
./repl_setup
./replicating_start B 4010
source ./gtmenv P V6.3-000A_x86_64
./db_create
./suppl_setup P startA 4011 -updok
$gtm_dist/mumps -r ^%XCMD 'for i=1:1:40 set ^P(i)=i'
source ./gtmenv B V6.3-000A_x86_64 
./replicating_stop
source ./gtmenv A V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'set ^A(98)=99'
source ./gtmenv P V6.3-000A_x86_64
./replicating_stop 
source ./gtmenv A V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'set ^A(99)=100'
./originating_stop
source ./gtmenv B V6.3-000A_x86_64
./originating_start B A 4010
./originating_start B P 4011
./backup_repl startB
$gtm_dist/mumps -r ^%XCMD 'set ^B(61)=0,^B(62)=1'
source ./gtmenv P V6.3-000A_x86_64
./rollback 4011 backward
./suppl_setup P startB 4011 -updok
$gtm_dist/mumps -r ^%XCMD 'for i=39:1:40 set ^P(i)=i'
source ./gtmenv A V6.3-000A_x86_64
./rollback 4010 backward
./replicating_start A 4010
source ./gtmenv B V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'set ^B(64)=1,^B(65)=1'
cat A/gtm.lost
cat P/gtm.lost

The shutdown sequence is as follows:

source ./gtmenv B V6.3-000A_x86_64
./originating_stop
source ./gtmenv A V6.3-000A_x86_64 
./replicating_stop
source ./gtmenv P V6.3-000A_x86_64
./replicating_stop

The following scenario demonstrates a switchover from B←A→P to A←B→P when A and P have unreplicated updates. By application design, unreplicated updates on P do not require rollback when B becomes the new originating instance.

A

B

P

Comments

O: ... A95, A96, A97, A98, A99

R: ... A95, A96, A97

S: ... P34, A95, P35, P36, A96, A97, P37, P38, A98, P39, P40

A as an originating primary instance at transaction number A99, replicates to B as a BC replicating secondary instance at transaction number A97 and P as a SI that includes transaction number A98, interspersed with locally generated updates. Updates are recorded in each instance's journal files using before-image journaling.

Crashes

O: ... A95, A96, A97, B61, B62

... P34, A95, P35, P36, A96, A97, P37, P38, A98, P39, P40

When an event disables A, B becomes the new originating primary, with A97 the latest transaction in its database and starts processing application logic. Unlike the previous example, in this case, application design permits (or requires) P to start replicating from B even though B does not have A98 in its database and P may have relied on A98 to compute P39 and P40.

-

O: ... A95, A96, A97, B61, B62

S: ... P34, A95, P35, P36, A96, A97, P37, P38, A98, P39, P40, B61, B62

With its Receiver Server started with the -noresync option, P can receive a SI replication stream from B, and replication starts from the last common transaction shared by B and P. Notice that on B no A98 precedes B61, whereas it does on P, i.e., P was ahead of B with respect to the updates generated by A.

The following example creates this scenario.

source ./gtmenv A V6.3-000A_x86_64
./db_create
./repl_setup
./originating_start A B 4010
./originating_start A P 4011
./backup_repl startA
$gtm_dist/mumps -r ^%XCMD 'for i=1:1:97 set ^A(i)=i'
source ./gtmenv B V6.3-000A_x86_64
./db_create
./repl_setup
./replicating_start B 4010 
source ./gtmenv P V6.3-000A_x86_64
./db_create
./suppl_setup P startA 4011 -updok
$gtm_dist/mumps -r ^%XCMD 'for i=1:1:40 set ^P(i)=i'
source ./gtmenv B V6.3-000A_x86_64 
./replicating_stop
source ./gtmenv A V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'set ^A(98)=99'
source ./gtmenv P V6.3-000A_x86_64
./replicating_stop 
source ./gtmenv A V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'set ^A(99)=100'
./originating_stop
source ./gtmenv B V6.3-000A_x86_64
./originating_start B A 4010
./originating_start B P 4011
#./backup_repl startB
$gtm_dist/mumps -r ^%XCMD 'set ^B(61)=0,^B(62)=1'
source ./gtmenv P V6.3-000A_x86_64
./replicating_start_suppl_n P 4011 -updok -noresync
$gtm_dist/mumps -r ^%XCMD 'for i=39:1:40 set ^P(i)=i'
source ./gtmenv A V6.3-000A_x86_64
./rollback 4010 backward
./replicating_start A 4010 
source ./gtmenv B V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'set ^B(64)=1,^B(65)=1'

The shutdown sequence is as follows:

source ./gtmenv B V6.3-000A_x86_64
./originating_stop
source ./gtmenv A V6.3-000A_x86_64
./replicating_stop
source ./gtmenv P V6.3-000A_x86_64
./replicating_stop

This scenario demonstrates the use of the -autorollback qualifier which performs a ROLLBACK ONLINE FETCHRESYNC under the covers.

A

B

P

Comments

O: ... A95, A96, A97, A98, A99

R: ... A95, A96, A97

S: ... P34, A95, P35, P36, A96, A97, P37, P38, A98, P39, P40

A as an originating primary instance at transaction number A99, replicates to B as a BC replicating secondary instance at transaction number A97 and P as a SI that includes transaction number A98, interspersed with locally generated updates. Updates are recorded in each instance"s journal files using before-image journaling.

R: Rolls back to A97 with A98 and A99 in the Unreplicated Transaction Log.

O: A95, A96, A97

S:Rolls back A98, P38, and P40

Instances receiving a replication stream from A can be configured to rollback automatically when A performs an online rollback by starting the Receiver Server with -autorollback. If P"s Receiver Server is so configured, it will roll A98, P39 and P40 into an Unreplicated Transaction Log. This scenario is straightforward. With the -noresync qualifier, the Receiver Server can be configured to simply resume replication without rolling back.

The following example run this scenario.

source ./gtmenv A V6.3-000A_x86_64
./db_create
./repl_setup
./originating_start A P 4000
./originating_start A B 4001
source ./gtmenv B V6.3-000A_x86_64
./db_create
./repl_setup
./replicating_start B 4001
source ./gtmenv A V6.3-000A_x86_64
./backup_repl startA 
source ./gtmenv P V6.3-000A_x86_64
./db_create
./suppl_setup P startA 4000 -updok
$gtm_dist/mumps -r %XCMD 'for i=1:1:38 set ^P(i)=i'
source ./gtmenv A V6.3-000A_x86_64
$gtm_dist/mumps -r %XCMD 'for i=1:1:97 set ^A(i)=i'
source ./gtmenv B V6.3-000A_x86_64
./replicating_stop
source ./gtmenv A V6.3-000A_x86_64
$gtm_dist/mumps -r %XCMD 'set ^A(98)=50'
source ./gtmenv P V6.3-000A_x86_64
$gtm_dist/mumps -r %XCMD 'for i=39:1:40 set ^P(i)=i'
./replicating_stop
source ./gtmenv A V6.3-000A_x86_64
$gtm_dist/mumps -r %XCMD 'set ^A(99)=100'
./originating_stop
source ./gtmenv B V6.3-000A_x86_64
./originating_start B A 4001 
./originating_start B P 4000
source ./gtmenv A V6.3-000A_x86_64
./replicating_start A 4001 -autorollback
source ./gtmenv P V6.3-000A_x86_64
#./rollback 4000 backward
./replicating_start_suppl_n P 4000 -updok -autorollback
#./replicating_start_suppl_n P 4000 -updok 

The shutdown sequence is as follows:

source ./gtmenv A V6.3-000A_x86_64
./replicating_stop
source ./gtmenv P V6.3-000A_x86_64
./replicating_stop
source ./gtmenv B V6.3-000A_x86_64
./originating_stop

Consider a situation where A and P are located in one data center, with BC replication to B and Q respectively, located in another data center. When the first data center fails, the SI replication from A to P is replaced by SI replication from B to Q. The following scenario describes a switchover from B←A→P→Q to A←B→Q→P with unreplicated updates on A and P.

A

B

P

Q

Comments

O: ... A95, A96, A97, A98, A99

R: ... A95, A96, A97, A98

S: ... P34, A95, P35, P36, A96, P37, A97, P38

R: ... P34, A95, P35, P36, A96, P37

A as an originating primary instance at transaction number A99, replicates to B as a BC replicating secondary instance at transaction number A98 and P as a SI that includes transaction number A97, interspersed with locally generated updates. P in turn replicates to Q.

Goes down with the data center

O: ... A95, A96, A97, A98, B61, B62

Goes down with the data center

... P34, A95, P35, P36, A96, P37

When a data center outage disables A, and P, B becomes the new originating primary, with A98 as the latest transaction in its database and starts processing application logic to maintain business continuity. Q can receive the SI replication stream from B, without requiring a rollback since the receiver is not ahead of the source.

-

O: ... A95, A96, A97, A98, B61, B62

-

S: ... P34, A95, P35, P36, A96, P37, A97, A98, Q73, B61, Q74, B62

Q receives SI replication from B and also applies its own locally generated updates. Although A97 and A98 were originally generated on A, Q receives them from B. Q also computes and applies locally generated updates

... A95, A96, A97, A98, A99

O: ... A95, A96, A97, A98, B61, B62, B63, B64

... P34, A95, P35, P36, A96, P37, A97,A98, P38

S: ... P34, A95, P35, P36, A96, P37, A97, A98, Q73, B61, Q74, B62, Q75, B63, Q76, B64

While B and Q, keep the enterprise in operation, the first data center is recovered. Since A has transactions in its database that were not replicated to B when the latter started operating as the originating primary instance, and since P had transactions that were not replicated to Q when the latter took over, A and P must now rollback their databases and create Unreplicated Transaction Files before receiving BC replication streams from B and Q respectively. A rolls off A99, P rolls off P38.

R: ... A95, A96, A97, B61, B62, B63, B64

O: ... A95, A96, A97, B61, B62, B63, B64, B65

R: ... P34, A95, P35, P36, A96, P37, A97, A98, Q73, B61, Q74, B62, Q75, B63, Q76, B64

S: ... P34, A95, P35, P36, A96, P37, A97, A98, Q73, B61, Q74, B62, Q75, B63, Q76, B64, Q77

Having rolled off their transactions into Unreplicated Transaction Logs, A can now operate as a BC replicating instance to B and P can operate as the SI replicating instance to Q. B and Q continue operating as originating primary instance and supplementary instance. P automatically receives M38 after applying the Unreplicated Transaction Log (from P) to Q. A and P automatically receive A99 after applying the Unreplicated Transaction Log (from A) to B.

The following example runs this scenario.

source ./gtmenv A V6.3-000A_x86_64
./db_create
./repl_setup
./originating_start A P 4000
./originating_start A B 4001
source ./gtmenv B V6.3-000A_x86_64
./db_create
./repl_setup
./replicating_start B 4001
source ./gtmenv A V6.3-000A_x86_64
./backup_repl startA 
source ./gtmenv P V6.3-000A_x86_64
./db_create
./suppl_setup P startA 4000 -updok
./backup_repl startP
./originating_start P Q 4005
source ./gtmenv Q V6.3-000A_x86_64
./db_create
./suppl_setup Q startP 4005 -updnotok
source ./gtmenv A V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'for i=1:1:96 set ^A(i)=i'
source ./gtmenv P V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'for i=1:1:37 set ^P(i)=i'
source ./gtmenv Q V6.3-000A_x86_64
./replicating_stop
source ./gtmenv P V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'set ^P(38)=1000'
./replicating_stop
source ./gtmenv A V6.3-000A_x86_64 
$gtm_dist/mumps -r ^%XCMD 'set ^A(97)=1000,^A(98)=1000'
source ./gtmenv B V6.3-000A_x86_64
./replicating_stop
source ./gtmenv A V6.3-000A_x86_64 
$gtm_dist/mumps -r ^%XCMD 'set ^A(99)=1000'
./originating_stop 
source ./gtmenv B V6.3-000A_x86_64
backup_repl startB
./originating_start B Q 4008
$gtm_dist/mumps -r ^%XCMD 'for i=1:1:62 set ^B(i)=i'
source ./gtmenv Q V6.3-000A_x86_64
./rollback 4008 backward
./suppl_setup Q startB 4008 -updok
$gtm_dist/mumps -r ^%XCMD 'for i=1:1:74 set ^Q(i)=i'
source ./gtmenv B V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'for i=63:1:64 set ^B(i)=i'
./originating_start B A 4004
source ./gtmenv A V6.3-000A_x86_64
./rollback 4004 backward
./replicating_start A 4004
source ./gtmenv Q V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'for i=75:1:76 set ^Q(i)=i'
./originating_start Q P 4007
./backup_repl startQ
source ./gtmenv P V6.3-000A_x86_64
./rollback 4007 backward
./replicating_start_suppl_n P 4007 -updnotok
source ./gtmenv Q V6.3-000A_x86_64
$gtm_dist/mumps -r ^%XCMD 'set ^Q(77)=1000'
cat A/gtm.lost
cat P/gtm.lost

The shutdown sequence is as follows:

source ./gtmenv P V6.3-000A_x86_64
./replicating_stop
source ./gtmenv A V6.3-000A_x86_64
./replicating_stop
source ./gtmenv Q V6.3-000A_x86_64
./replicating_stop
./originating_stop
source ./gtmenv B V6.3-000A_x86_64
./originating_stop

In a replication configuration, a global directory provides information to map global updates to their respective database files. As replication processes pick the state of the global directory at process startup, any change made to the global directory requires process restarts (at a minimum) to bring that change into effect. A switchover mechanism can ensure application availability while making global directory changes.

On B:

On A:

  1. Shutdown replication.

  2. If the globals you are moving have triggers, make a copy of their definitions with MUPIP TRIGGER -SELECT and delete them with MUPIP TRIGGER; note if the triggers are the same as those on B, which they normally would be for a BC instance you can just delete them and use the definitions saved on B.

  3. Update the global directory.

  4. If you are rearranging the global name spaces which do not contain any data, skip to step 7.

  5. Create a backup copy of A, turn off replication, and cut the previous links of the journal file.

  6. Use the MERGE command to copy a global from the prior to the new location. Use extended references (to the prior global directory) to refer to global in the prior location.

  7. If the globals you are moving have triggers, apply the definitions saved in step 1.

  8. Turn replication on for the region of the new global location.

  9. Make A the new replicating instance.

Perform a switchover to return to the A->B configuration. Once normal operation resumes, remove the global from the prior location (using extended references) to release space.

If a switchover mechanism is not in place and a downtime during the global directory update is acceptable, follow these steps:

On B:

  • Perform steps 1 to 9.

  • Restart the Receiver Server and the Update Process.

On A:

  • Bring down the application (or prevent new updates from getting started).

  • Perform Steps 1 to 8.

  • Restart the originating instance.

  • Restart the active Source Server.

  • Bring up the application.

This example adds the mapping for global ^A to a new database file A.dat in an A->B replication configuration.

source ./gtmenv A V6.3-000A_x86_64 
./db_create
./repl_setup
./originating_start A B 4001
source ./gtmenv B V6.3-000A_x86_64
./db_create
./repl_setup
./replicating_start B 4001
source ./gtmenv A V6.3-000A_x86_64 
$gtm_dist/mumps -r %XCMD 'for i=1:1:10 set ^A(i)=i'
./repl_status
source ./gtmenv B V6.3-000A_x86_64
./replicating_stop
cp B/gtm.gld B/prior.gld
$gtm_dist/mumps -r ^GDE @updgld
./db_create
mkdir backup_B
$gtm_dist/mupip backup "*" backup_B  -replinst=backup_B/gtm.repl
$gtm_dist/mupip set -journal=on,before_images,filename=B/gtm.mjl -noprevjnlfile -region "DEFAULT"
$gtm_dist/mumps -r %XCMD 'merge ^A=^|"B/prior.gld"|A'
$gtm_dist/mupip set -replication=on -region AREG
./originating_start B A 4001
source ./gtmenv A V6.3-000A_x86_64 
./originating_stop
./rollback 4001 backward
cat A/gtm.lost  #apply lost transaction file on A. 
./replicating_start A 4001
./replicating_stop 
cp A/gtm.gld A/prior.gld
$gtm_dist/mumps -r ^GDE @updgld
./db_create
mkdir backup_A
$gtm_dist/mupip backup "*" backup_A -replinst=backup_A/gtm.repl
$gtm_dist/mupip set -journal=on,before_images,filename=A/gtm.mjl -noprevjnlfile -region "DEFAULT"
$gtm_dist/mumps -r %XCMD 'merge ^A=^|"A/prior.gld"|A'
$gtm_dist/mupip set -replication=on -region AREG
./replicating_start A 4001
./repl_status
#Perform a switchover to return to the A->B configuration. Remove the global in the prior location to release space with a command like Kill ^A=^|"A/prior.gld"|A'. 

The shutdown sequence is as follows:

source ./gtmenv A V6.3-000A_x86_64
./replicating_stop
source ./gtmenv B V6.3-000A_x86_64
./originating_stop

A rolling software upgrade is the procedure of upgrading an instance in a way that there is minimal impact on the application uptime. An upgrade may consist of changing the underlying database schema, region(s), global directory, database version, application version, triggers, and so on. There are two approaches for a rolling upgrade. The first approach is to upgrade the replicating instance and then upgrade the originating instance. The second approach is to upgrade the originating instance first while its replicating (standby) instance acts as an originating instance.

The following two procedures demonstrate these rolling software upgrade approaches for upgrading an A→B replication configuration running an application using GT.M V6.1-000_x86_64 to GT.M V6.2-001_x86_64 with minimal (a few seconds) impact on the application downtime.

On B:

on A:

on A:

on B:

[Important]Note on Triggers

While adding triggers, bear in mind that triggers get replicated if you add them when replication is turned on. However, when you add triggers when replication is turned off, those triggers and the database updates resulting from the executing their trigger code do not get replicated.

Here is an example to upgrade A and B deployed in an A→B replication configuration from V6.1-000_x86_64 to V6.2-001_x86_84. This example uses instructions from the “Upgrade the originating instance first (A→B)” procedure.

source ./env A V6.1-000_x86_64
./db_create
./repl_setup
./originating_start A B 4001
source ./env B V6.1-000_x86_64
./db_create
./repl_setup
./replicating_start B 4001
source ./env A V6.1-000_x86_64
$gtm_dist/mumps -r %XCMD 'for i=1:1:100 set ^A(i)=i'
./status
source ./env B V6.1-000_x86_64
./replicating_stop
source ./env A V6.1-000_x86_64
./status
./originating_stop 
$gtm_dist/mupip set -replication=off -region "DEFAULT"
$gtm_dist/dse dump -f 2>&1| grep "Region Seqno"
#Perform a switchover to make B the originating instance. 
source ./env A V6.2-001_x86_64
$gtm_dist/mumps -r ^GDE exit
$gtm_dist/mupip set -journal=on,before_images,filename=A/gtm.mjl -noprevjnlfile -region "DEFAULT"
#Perform the upgrade 
$gtm_dist/dse dump -fileheader 2>&1| grep "Region Seqno"
#If Region Seqno is greater than the Region Seqno noted previously, run $gtm_dist/dse change -fileheader -req_seqno=<previously_noted_region_seqno>.
./repl_setup
#A is now upgraded to V6.2-001_x86_64 and is ready to resume the role of the originating instance. Shutdown B and reinstate A as the originating instance. 
./originating_start A B 4001
source ./env B V6.2-001_x86_64
$gtm_dist/mumps -r ^GDE exit
$gtm_dist/mupip set -journal=on,before_images,filename=B/gtm.mjl -noprevjnlfile -region "DEFAULT"
#Perform the upgrade 
$gtm_dist/dse dump -fileheader 2>&1| grep "Region Seqno"
#If Region Seqno is different, run $gtm_dist/dse change -fileheader -req_seqno=<previously_noted_region_seqno>.
$gtm_dist/dse dump -f 2>&1| grep "Region Seqno"
#If Region Seqno is greater than the Region Seqno noted previously, run $gtm_dist/dse change -fileheader -req_seqno=<previously_noted_region_seqno>.
./repl_setup
./replicating_start B 4001

The shutdown sequence is as follows:

source ./env B V6.2-001_x86_64
./replicating_stop
source ./env A V6.2-001_x86_64
./originating_stop

You do not need to create a new replication instance file except when you upgrade from a GT.M version prior to V5.5-000. Unless stated in the release notes of your GT.M version, you instance file does not need to be upgraded. If you are creating a new replication instance file for any administration purpose, remember that doing do so will remove history records which may prevent it from resuming replication with other instances. To create a new replication instance file, follow these steps:

The -updateresync qualifier indicates that instead of negotiating a mutually agreed common starting point for synchronization the operator is guaranteeing the receiving instance has a valid state that matches the source instance currently or as some point in the past. Generally this means the receiving instance has just been updated with a backup copy from the source instance.

On instances with the same endian-ness, follow these steps to create a replication instance file without using the -updateresync qualifier.

On the source side:

On the receiving side:

[Note]

When the instances have different endian-ness, create a new replication instance file as described in Creating the Replication Instance File

The following example creates two instances (Alice and Bob) and a basic framework required for setting up a TLS replication connection between them. Alice and Bob are fictional characters from https://en.wikipedia.org/wiki/Alice_and_Bob and represent two instances who use the certificates signed by the same demo root CA. This example is solely for the purpose of explaining the general steps required to encrypt replication data in motion. You must understand, and appropriately adjust, the scripts before using them in a production environment. Note that all certificates created in this example are for the sake of explaining their roles in a TLS replication environment. For practical applications, use certificates signed by a CA whose authority matches your use of TLS.

  1. Remove the comment tags from the following lines in the gtmenv script:

    export gtmcrypt_config=$PWD/$gtm_repl_instname/config_file
    echo -n "Enter Password for gtmtls_passwd_${gtm_repl_instname}: ";export gtmtls_passwd_${gtm_repl_instname}="`$gtm_dist/plugin/gtmcrypt/maskpass|tail -n 1|cut -f 3 -d " "`"
  2. Execute the gtmenv script as follows:

    $ source ./gtmenv Alice V6.2-001_x86_64

    This creates a GT.M environment for replication instance name Alice. When prompted, enter a password for gtmtls_passwd_Alice.

  3. ./db_create

    This creates the global directory and the database for instance Alice.

  4. Create a demo root CA, leaf-level certificate, and a $gtmcrypt_config file with a tlsid called Alice for instance Alice. Note that in this example, $gtmcrypt_config is set to $PWD/Alice/config_file. For more information on creating the $gtmcrypt_config file and the demo certificates required to run this example, refer to Appendix G: “Creating a $gtmcrypt_config file.

    Your $gtmcrypt_config file should look something like:

    tls: {
        verify-depth: 7;
         CAfile: "/path/to/certs/ca.crt";
         Alice: {
               format: "PEM";
               cert: "/path/to/certs/Alice.crt";
               key: "/path/to/certs/Alice.key";
         };
      };
  5. Turn replication on and create the replication instance file:

    $ ./repl_setup
  6. Start the originating instance Alice:

    $ ./originating_start Alice Bob 4001 -tlsid=Alice -reneg=2

On instance Bob:

  1. Execute the gtmenv script as follows:

    $ source ./gtmenv Bob V6.2-001_x86_64

    This creates a GT.M environment for replication instance name Bob. When prompted, enter a password for gtmtls_passwd_Bob.

  2. $ ./db_create

    This creates the global directory and the database for instance Bob.

  3. Create a leaf-level certificate and a $gtmcrypt_config file with a tlsid called Bob for instance Bob. Note that in this example, $gtmcrypt_config is set to $PWD/Bob/config_file. Note that you would use the demo CA that you created before to sign this leaf-level certificate. For replication to proceed, both leaf-level certificates must be signed by the same root CA. For more information, refer to Appendix G: “Creating a $gtmcrypt_config file.

    Your $gtmcrypt_config file should look something like:

    tls: {
        verify-depth: 7;
         CAfile: "/path/to/certs/ca.crt";
         Bob: {
               format: "PEM";
               cert: "/path/to/certs/Bob.crt";
               key: "/path/to/certs/Bob.key";
         };
      };
  4. Turn replication on and create the replication instance file:

    $ ./repl_setup
  5. Start the replicating instance Bob.

    $ ./replicating_start Bob 4001 -tlsid=Bob

For subsequent environment setup, use the following commands:

source ./gtmenv Bob V6.2-001_x86_64 or source ./gtmenv Alice V6.2-001_x86_64
./replicating_start Bob 4001 -tlsid=Bob or ./originating_start Alice Bob 4001 -tlsid=Alice -reneg=2

If you notice the replication WAS_ON state, correct the cause that made GT.M turn journaling off and then execute MUPIP SET -REPLICATION=ON.

To make storage space available, first consider moving unwanted non-journaled and temporary data. Then consider moving the journal files that predate the last backup. Moving the currently linked journal files is a very last resort because it disrupts the back links and a rollback or recover will not be able to get back past this discontinuity unless you are able to return them to their original location.

If the replication WAS_ON state occurs on the originating side:

If the Source Server does not reference any missing journal files, -REPLICATION=ON resumes replication with no downtime.

If the Source Server requires any missing journal file, it produces a REPLBRKNTRANS or NOPREVLINK error and shuts down. Note that you cannot rollback after journaling turned off because there is insufficient information to do such a rollback.

In this case, proceed as follows:

If the replication WAS_ON state occurs on the receiving side:

Execute MUPIP SET -REPLICATION=ON to return to the replication ON state. This resumes normal replication on the receiver side. As an additional safety check, extract the journal records of updates that occurred during the replication WAS_ON state on the originating instance and randomly check whether those updates are present in the receiving instance.

If replication does not resume properly (due to errors in the Receiver Server or Update Process), proceed as follows:

When a rollback operations fails with CHNGTPRSLVTM, NOPREVLINK, and JNLFILEOPENERR messages, evaluate whether you have a crashed region in your global directory that is seldom used for making updates (idle). The updates in an idle region's current generation journal file may have timestamps and sequence numbers that no longer exist in the prior generation journal file chains of more frequently updated regions because of periodic pruning of existing journal files as part of routine maintenance. MUPIP SET and BACKUP commands can also remove previous generation journal file links.

Terminating a process accessing an idle region abnormally (say with kill -9 or some other catastrophic event) may leave its journal files improperly closed. In such an case, the discrepancy may go unnoticed until the next database update or rollback. Performing a rollback including such an idle region may then resolve the unified rollback starting time, (reported with a CHNGTPRSLVTM message), to a point in time that does not exist in the journal file chain for the other regions, thus causing the rollback to fail.

In this rare but possible condition, first perform a rollback selectively for the idle region(s). Here are the steps:

You do not need to perform these steps if you have a non-replicated but journaled database because RECOVER operations do not coordinate across regions.

As a general practice, perform an optimal recovery/rollback every time when starting a GT.M application from quiescence and, depending on the circumstances, after a GT.M process terminates abnormally.

FIS recommends rotating journal files with MUPIP SET when removing old journal files or ensuring that all regions are periodically updated.

loading table of contents...