Configure GoldenGate to Purge Old Trails

Introduction

When you set up GoldenGate replication, it generates many trail files that contain the transaction activity extracted from the database or from other trails. Over time these files can exhaust the available disk space.
That’s why you need to configure GoldenGate to automatically purge trail files once they are no longer needed, that is, when:

  • They have already been consumed by the Replicat and data pump processes.
  • They are old enough that they will not be needed for reprocessing.

Configure Trail Purging

The automatic purge process is activated with the PURGEOLDEXTRACTS parameter. According to the documentation:

PURGEOLDEXTRACTS: Purges trail files when Oracle GoldenGate is finished processing them. Without PURGEOLDEXTRACTS, no purging is performed and trail files can consume significant disk space. For best results, use PURGEOLDEXTRACTS as a Manager parameter, not as an Extract or Replicat parameter.

Stop Processes

We need to terminate all GoldenGate processes before stopping Manager for reconfiguration:

[golden@patogolden golden]$ ./ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 19.1.0.0.4 OGGCORE_19.1.0.0.0_PLATFORMS_191017.1054_FBO
Linux, x64, 64bit (optimized), Oracle 19c on Oct 17 2019 21:16:29
Operating system character set identified as UTF-8.

GGSCI (patogolden) 1> stop *

Sending STOP request to EXTRACT EXTPATO ...
Request processed.

Sending STOP request to REPLICAT REPATO ...
Request processed.

GGSCI (patogolden) 2> stop mgr
Manager process is required by other GGS processes.
Are you sure you want to stop it (y/n)?y

Sending STOP request to MANAGER ...
Request processed.
Manager stopped.

GGSCI (patogolden) 3> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     STOPPED
EXTRACT     STOPPED     EXTPATO     00:00:00      00:15:20
REPLICAT    STOPPED     REPATO      00:00:00      00:15:14

Manager Parameter Setup

In GGSCI, edit the Manager parameter file and add a PURGEOLDEXTRACTS entry with these options:

  • USECHECKPOINTS so that a trail file is deleted only after it has been processed.
  • MINKEEPFILES to retain a minimum number of trail files for possible reprocessing.
  • FREQUENCYMINUTES to set how often Manager runs the purge.

GGSCI (patogolden) 4> edit params mgr
PORT 7809
DYNAMICPORTLIST 7810-7820
AUTOSTART ER *
PURGEOLDEXTRACTS /golden/dirdat/*, USECHECKPOINTS, MINKEEPFILES 4, FREQUENCYMINUTES 15
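
To make the pieces of that entry explicit, here is a minimal Python sketch that splits the line into its components. It is only an illustration for reading the rule, not part of GoldenGate: the function name parse_purge_rule and the dictionary keys are hypothetical, and the sketch handles only the three options used in this article.

```python
def parse_purge_rule(line):
    """Split a Manager PURGEOLDEXTRACTS line into its parts (illustrative only)."""
    keyword, _, rest = line.strip().partition(" ")
    if keyword.upper() != "PURGEOLDEXTRACTS":
        raise ValueError("not a PURGEOLDEXTRACTS line")
    parts = [p.strip() for p in rest.split(",")]
    # None means "not stated on the line"; this sketch does not model defaults.
    rule = {
        "trail": parts[0],
        "use_checkpoints": None,
        "min_keep_files": None,
        "frequency_minutes": None,
    }
    for option in parts[1:]:
        tokens = option.split()
        name = tokens[0].upper()
        if name == "USECHECKPOINTS":
            rule["use_checkpoints"] = True
        elif name == "MINKEEPFILES":
            rule["min_keep_files"] = int(tokens[1])
        elif name == "FREQUENCYMINUTES":
            rule["frequency_minutes"] = int(tokens[1])
        else:
            raise ValueError(f"option not handled by this sketch: {option}")
    return rule


rule = parse_purge_rule(
    "PURGEOLDEXTRACTS /golden/dirdat/*, USECHECKPOINTS, MINKEEPFILES 4, FREQUENCYMINUTES 15"
)
print(rule)
```

Read this way, the entry says: for every trail under /golden/dirdat, purge only checkpointed files, keep at least 4 files, and evaluate the rule every 15 minutes.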

Start Manager and Replication

GGSCI (patogolden) 5> start mgr
Manager started.

GGSCI (patogolden) 6> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EXTPATO     00:00:00      00:00:08
REPLICAT    RUNNING     REPATO      00:00:00      00:00:04

Validate Purging

First, we stop the Replicat so that trail files accumulate:

GGSCI (patogolden) 7> stop replicat repato

Sending STOP request to REPLICAT REPATO ...
Request processed.

Validate USECHECKPOINTS

Checking the dirdat directory, we see 9 trail files instead of the 4 that we configured with MINKEEPFILES.
This is due to the USECHECKPOINTS parameter. Because those trails have not yet been processed by the stopped Replicat, they are not candidates for deletion:

[golden@patogolden golden]$ cd dirdat
[golden@patogolden dirdat]$ ls
tr000000021  tr000000023  tr000000025  tr000000027  tr000000029
tr000000022  tr000000024  tr000000026  tr000000028

Validate MINKEEPFILES

Let’s start the Replicat and check what happens:

GGSCI (patogolden) 1> start replicat repato

Sending START request to MANAGER ...
REPLICAT REPATO starting

After some time we see the Replicat has processed all pending trails up to tr000000029:

GGSCI (patogolden) 2> info replicat repato

REPLICAT   REPATO    Last Started 2020-06-10 22:09   Status RUNNING
INTEGRATED
Checkpoint Lag       00:00:00 (updated 00:00:09 ago)
Process ID           27661
Log Read Checkpoint  File /golden/dirdat/tr000000029
                     2020-06-10 22:12:46.000000  RBA 720227

GGSCI (patogolden) 3> send mgr, getpurgeoldextracts

Sending GETPURGEOLDEXTRACTS request to MANAGER ...

PurgeOldExtracts Rules
Fileset                              MinHours MinFiles UseCP
/golden/dirdat/*                            0        4   Y
OK
Extract Trails
Filename                        Oldest_Chkpt_Seqno
/golden/dirdat/tr       29

Checking the Manager report file, we can see that these 5 files were purged according to our rule:

[golden@patogolden dirrpt]$ tail -f MGR.rpt

2020-06-10 22:15:42  INFO    OGG-00957  Purged old extract file '/golden/dirdat/tr000000021', applying UseCheckPoints purge rule: Oldest Chkpt Seqno 29 > 21.

2020-06-10 22:15:42  INFO    OGG-00957  Purged old extract file '/golden/dirdat/tr000000022', applying UseCheckPoints purge rule: Oldest Chkpt Seqno 29 > 22.

2020-06-10 22:15:42  INFO    OGG-00957  Purged old extract file '/golden/dirdat/tr000000023', applying UseCheckPoints purge rule: Oldest Chkpt Seqno 29 > 23.

2020-06-10 22:15:42  INFO    OGG-00957  Purged old extract file '/golden/dirdat/tr000000024', applying UseCheckPoints purge rule: Oldest Chkpt Seqno 29 > 24.

2020-06-10 22:15:42  INFO    OGG-00957  Purged old extract file '/golden/dirdat/tr000000025', applying UseCheckPoints purge rule: Oldest Chkpt Seqno 29 > 25.
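
If you prefer to collect these messages programmatically instead of reading the report by eye, the following Python sketch extracts the purged file names from OGG-00957 lines. It assumes each message sits on a single line in the format shown above; the function name purged_files and the sample text are illustrative, and a real report may wrap lines differently.

```python
import re

# Matches the OGG-00957 purge message format shown in the report excerpt.
PURGE_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+INFO\s+OGG-00957\s+"
    r"Purged old extract file '(?P<path>[^']+)'"
)


def purged_files(report_text):
    """Return (timestamp, path) for every OGG-00957 purge message found."""
    found = []
    for line in report_text.splitlines():
        match = PURGE_LINE.match(line.strip())
        if match:
            found.append((match.group("ts"), match.group("path")))
    return found


# Two of the messages from the report above, used as sample input.
sample = """\
2020-06-10 22:15:42  INFO    OGG-00957  Purged old extract file '/golden/dirdat/tr000000021', applying UseCheckPoints purge rule: Oldest Chkpt Seqno 29 > 21.

2020-06-10 22:15:42  INFO    OGG-00957  Purged old extract file '/golden/dirdat/tr000000025', applying UseCheckPoints purge rule: Oldest Chkpt Seqno 29 > 25.
"""

for ts, path in purged_files(sample):
    print(ts, path)
```

To run it against the real report, read the contents of MGR.rpt into a string and pass that to purged_files.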

Finally, validate the result in the dirdat directory:

[golden@patogolden dirdat]$ ls
tr000000026  tr000000027  tr000000028  tr000000029

We can see that even though the tr000000026, tr000000027 and tr000000028 files were already processed, they are preserved to satisfy the configured minimum number of files.
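
The interaction of the two subparameters can be modeled with a few lines of Python. This is a simplified sketch that reproduces the behavior observed in this test, not Manager's actual implementation. The function name files_to_purge is hypothetical, and the checkpoint value used for the stopped Replicat (file 21) is an assumption made only for illustration.

```python
def files_to_purge(seqnos, oldest_chkpt_seqno, min_keep_files):
    """Simplified model: purge the oldest fully processed files, but never
    leave fewer than min_keep_files files in the trail."""
    ordered = sorted(seqnos)
    # USECHECKPOINTS: only files below the oldest checkpoint are candidates.
    candidates = [s for s in ordered if s < oldest_chkpt_seqno]
    # MINKEEPFILES: cap the number of deletions so that enough files remain.
    max_deletions = max(0, len(ordered) - min_keep_files)
    return candidates[:max_deletions]


trail = list(range(21, 30))  # tr000000021 .. tr000000029

# Replicat stopped: assume (hypothetically) its checkpoint is still in file 21.
print(files_to_purge(trail, oldest_chkpt_seqno=21, min_keep_files=4))  # → []

# Replicat caught up to file 29, as reported by GETPURGEOLDEXTRACTS.
print(files_to_purge(trail, oldest_chkpt_seqno=29, min_keep_files=4))  # → [21, 22, 23, 24, 25]
```

In the second case, files 26 to 28 are below the checkpoint and therefore eligible, but deleting them would leave fewer than 4 files, so they stay.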

Conclusion

We configured and validated the GoldenGate Manager parameter PURGEOLDEXTRACTS and its subparameters USECHECKPOINTS and MINKEEPFILES to automate the purging of old trail files and keep enough free space available in our dirdat directory.