Switchover en Oracle DataGuard 19c

Introducción

Uno de los principales beneficios de tener Oracle DataGuard es la posibilidad de mover la actividad de una base de datos a otra instancia con el fin de realizar algún mantenimiento en la instancia principal. Esto se logra al intercambiar los roles de las instancias participantes en el ambiente DataGuard como se describe en la documentación:

Durante un switchover, la base de datos primaria transiciona a un role de standby, y la base de datos standby transiciona al rol primario.
Un switchover garantiza que no haya pérdida de datos y se hace típicamente para mantenimientos planeados en el sistema primario.

Prerequisitos

Estamos usando un ambiente DataGuard con una instancia Primaria y una Standby Física en Oracle 19c multitenant con una base de datos pluggable. Para más información sobre cómo crear un ambiente DataGuard favor de revisar Crear una Standby Física de Oracle DataGuard.

Switchover

Para hacer un Switchover en un ambiente DataGuard tenemos que realizar los siguientes pasos:

  • En la base de datos Primaria actual:

    • alter database switchover toSTANDBY_DB_UNIQUE_NAME;
  • En la base de datos Standby anterior (nueva Primaria):

    • alter database open;
  • En la base de datos Primaria anterior (nueva Standby):

    • startup mount
    • alter database recover managed standby database disconnect from session;

Hagamos un switchover en nuestro ambiente DataGuard.

Verificar si estamos Listos para el Switchover

Primero necesitamos validar que nuestra base de datos Standby está lista para procesar el switchover usando la opción verify:

[oracle@patodgprmy ~]$ sqlplus / as sysdba

SQL> alter database switchover to STBYDB verify;

Database altered.

Podemos proceder ya que no se encontraron errores ORA- ni mensajes de alerta en la respuesta del comando.

Iniciar el Switchover

En la base de datos Primaria enviar el comando alter database switchover apuntando al nombre único de la base de datos de nuestra instancia Standby actual:

SQL> alter database switchover to STBYDB;

Database altered.

SQL> exit
Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

[oracle@patodgprmy ~]$

En versiones previas este paso se hacía en dos comandos:

  • en la Primaria: alter database commit to switchover to standby;
  • en la Standby: alter database commit to switchover to primary;

El switchover to standby se está realizando como podemos ver en el registro de alertas de la Primaria anterior:

2020-07-01T18:06:13.593410-05:00
alter database switchover to STBYDB
2020-07-01T18:06:13.593530-05:00
NET  (PID:5479): The Time Management Interface (TMI) is being enabled for role transition
NET  (PID:5479): information.  This will result in messages beingoutput to the alert log
NET  (PID:5479): file with the prefix 'TMI: '.  This is being enabled to make the timing of
NET  (PID:5479): the various stages of the role transition available for diagnostic purposes.
NET  (PID:5479): This output will end when the role transition is complete.
TMI: dbsdrv switchover to target BEGIN 2020-07-01 18:06:13.593944
NET  (PID:5479): Starting switchover [Process ID: 5479]
TMI: kcv_switchover_to_target convert to physical BEGIN 2020-07-01 18:06:13.671793
2020-07-01T18:06:13.672030-05:00
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY [Process Id: 5479] (prmydb)
NET  (PID:5479): Waiting for target standby to receive all redo
2020-07-01T18:06:13.681099-05:00
NET  (PID:5479): Waiting for all non-current ORLs to be archived
2020-07-01T18:06:13.681299-05:00
NET  (PID:5479): All non-current ORLs have been archived
2020-07-01T18:06:13.681454-05:00
NET  (PID:5479): Waiting for all FAL entries to be archived
2020-07-01T18:06:13.681574-05:00
NET  (PID:5479): All FAL entries have been archived
2020-07-01T18:06:13.681737-05:00
NET  (PID:5479): Waiting for LAD:2 to become synchronized
2020-07-01T18:06:14.682455-05:00
NET  (PID:5479): Active, synchronized Physical Standby switchover target has been identified
NET  (PID:5479): Preventing updates and queries at the Primary
2020-07-01T18:06:15.683620-05:00
NET  (PID:5479): Generating and shipping final logs to target standby
Switchover End-Of-Redo Log thread 1 sequence 9 has been fixed
Switchover: Primary highest seen SCN set to 0x00000000002f2008
NET  (PID:5479): Noswitch archival of T-1.S-9
NET  (PID:5479): End-Of-Redo Branch archival of T-1.S-9
NET  (PID:5479): LGWR is scheduled to archive to LAD:2 after log switch
NET  (PID:5479): SRL selected for T-1.S-9 for LAD:2
NET  (PID:5479): Archiving is disabled due to current logfile archival
Primary will check for some target standby to have received all redo
NET  (PID:5479): Waiting for target standby to apply all redo
2020-07-01T18:06:16.887346-05:00
Backup controlfile written to trace file /opt/oracle/diag/rdbms/prmydb/prmydb/trace/prmydb_ora_5479.trc
NET  (PID:5479): Converting the primary database to a new standby database
Clearing standby activation ID 2982872916 (0xb1cb0754)
The primary database controlfile was created using the
'MAXLOGFILES 16' clause.
There is space for up to 13 standby redo logfiles
Use the following SQL commands on the standby database to create
standby redo logfiles that match the primary database:
ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 67108864;
ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 67108864;
ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 67108864;
ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 67108864;
Archivelog for thread 1 sequence 9 required for standby recovery
Offline data file 5 marked as online during convert to standby or switchover to standby.
Restore of backup may be required if the file is not physically accessible.
Offline data file 6 marked as online during convert to standby or switchover to standby.
Restore of backup may be required if the file is not physically accessible.
Offline data file 8 marked as online during convert to standby or switchover to standby.
Restore of backup may be required if the file is not physically accessible.
Switchover: Primary controlfile converted to standby controlfile succesfully.
Switchover: Complete - Database shutdown required
TMI: kcv_switchover_to_target convert to physical END 2020-07-01 18:06:16.939210
NET  (PID:5479): Sending request(convert to primary database) to switchover target STBYDB
2020-07-01T18:06:21.359259-05:00
NET  (PID:5479): Switchover complete. Database shutdown required
USER (ospid: 5479): terminating the instance
2020-07-01T18:06:22.574955-05:00
Instance terminated by USER, pid = 5479
TMI: dbsdrv switchover to target END 2020-07-01 18:06:22.575031
Completed: alter database switchover to STBYDB
Shutting down ORACLE instance (abort) (OS id: 5479)
License high water mark = 32
2020-07-01T18:06:22.704849-05:00
Instance shutdown complete (OS id: 5479)

Y el switchover to primary se está realizando como podemos ver en el registro de alertas de la Standby anterior:

2020-07-01T18:06:16.968171-05:00
.... (PID:3021): The Time Management Interface (TMI) is being enabled for role transition
.... (PID:3021): information.  This will result in messages beingoutput to the alert log
.... (PID:3021): file with the prefix 'TMI: '.  This is being enabled to make the timing of
.... (PID:3021): the various stages of the role transition available for diagnostic purposes.
.... (PID:3021): This output will end when the role transition is complete.
SWITCHOVER: received request 'ALTER DATABASE COMMIT TO SWITCHOVER  TO PRIMARY' from primary database.
2020-07-01T18:06:16.969043-05:00
ALTER DATABASE SWITCHOVER TO PRIMARY (stbydb)
Maximum wait for role transition is 15 minutes.
TMI: kcv_commit_to_so_to_primary wait for MRP to finish BEGIN 2020-07-01 18:06:16.969636
Switchover: Media recovery is still active
 rmi (PID:3021): Role Change: Canceling MRP - no more redo to apply
2020-07-01T18:06:17.001648-05:00
PR00 (PID:2476): MRP0: Background Media Recovery cancelled with status 16037
2020-07-01T18:06:17.081607-05:00
Errors in file /opt/oracle/diag/rdbms/stbydb/stbydb/trace/stbydb_pr00_2476.trc:
ORA-16037: user requested cancel of managed recovery operation
PR00 (PID:2476): Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
stopping change tracking
2020-07-01T18:06:17.262857-05:00
Errors in file /opt/oracle/diag/rdbms/stbydb/stbydb/trace/stbydb_pr00_2476.trc:
ORA-16037: user requested cancel of managed recovery operation
2020-07-01T18:06:18.389417-05:00
Background Media Recovery process shutdown (stbydb)
2020-07-01T18:06:19.002445-05:00
 rmi (PID:3021): Role Change: Canceled MRP
TMI: kcv_commit_to_so_to_primary wait for MRP to finish END 2020-07-01 18:06:19.002622
TMI: kcv_commit_to_so_to_primary Switchover from physical BEGIN 2020-07-01 18:06:19.002859
 rmi (PID:3021): Killing 2 processes (PIDS:3012,2490) (all RFS) in order to disallow current and future RFS connections. Requested by OS process 3021
2020-07-01T18:06:19.046735-05:00
Process termination requested for pid 3012 [source = rdbms], [info = 2] [request issued by pid: 3021, uid: 54321]
2020-07-01T18:06:19.046739-05:00
Process termination requested for pid 2490 [source = rdbms], [info = 2] [request issued by pid: 3021, uid: 54321]
2020-07-01T18:06:21.144654-05:00
Backup controlfile written to trace file /opt/oracle/diag/rdbms/stbydb/stbydb/trace/stbydb_rmi_3021.trc
SwitchOver after complete recovery through change 3088392
 rmi (PID:3021): ORL pre-clearing operation disabled by switchover
Online log /oradata/cdbs/STBYDB/redo01.log: Thread 1 Group 1 was previously cleared
Online log /oradata/cdbs/STBYDB/redo02.log: Thread 1 Group 2 was previously cleared
Online log /oradata/cdbs/STBYDB/redo03.log: Thread 1 Group 3 was previously cleared
Standby became primary SCN: 3088390
 rmi (PID:3021): RT: Role transition work is not done
 rmi (PID:3021): The Time Management Interface (TMI) is being enabled for role transition
 rmi (PID:3021): information.  This will result in messages beingoutput to the alert log
 rmi (PID:3021): file with the prefix 'TMI: '.  This is being enabled to make the timing of
 rmi (PID:3021): the various stages of the role transition available for diagnostic purposes.
 rmi (PID:3021): This output will end when the role transition is complete.
 rmi (PID:3021): Redo network throttle feature is disabled at mount time
2020-07-01T18:06:21.356789-05:00
 rmi (PID:3021): Database role cleared from PHYSICAL STANDBY [kcvs.c:1030]
Switchover: Complete - Database mounted as primary
TMI: kcv_commit_to_so_to_primary Switchover from physical END 2020-07-01 18:06:21.357243
SWITCHOVER: completed request from primary database.

Abrir la Standby como Primaria

Una vez que el proceso de switchover haya terminado necesitamos abrir la anterior instancia Standby como Primaria, también abriendo la base de datos pluggable:

[oracle@patodgstby ~]$ sqlplus / as sysdba

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> alter database open;

Database altered.

SQL> alter pluggable database taller open ;

Pluggable database altered.

SQL> show pdbs ;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 TALLER                         READ WRITE NO

Nuesta base de datos pluggable está abierta y lista para usarse en este lado.

Iniciar la Recuperación en la Primaria

Finalmente en la Primaria anterior montamos la instancia e iniciamos la recuperación para mantenerla actualizada:

[oracle@patodgprmy ~]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Wed Jul 1 18:09:12 2020
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup mount
ORACLE instance started.

Total System Global Area 1610609888 bytes
Fixed Size                  9135328 bytes
Variable Size             989855744 bytes
Database Buffers          603979776 bytes
Redo Buffers                7639040 bytes
Database mounted.
SQL> alter database recover managed standby database disconnect from session;

Database altered.

Validar los Nuevos Roles de las Bases de Datos

Validamos que el estado actual de la base de datos Primaria original es ahora Standby física:

[oracle@patodgprmy ~]$ sqlplus / as sysdba

SQL> select name,open_mode,database_role,switchover_status from v$database ;

NAME      OPEN_MODE            DATABASE_ROLE    SWITCHOVER_STATUS
--------- -------------------- ---------------- --------------------
PRMYDB    MOUNTED              PHYSICAL STANDBY NOT ALLOWED

y podemos ver que el estado actual de la base de datos Standby original es ahora Primaria:

[oracle@patodgstby ~]$ sqlplus / as sysdba

SQL> select name, open_mode, database_role, switchover_status from v$database ;

NAME      OPEN_MODE            DATABASE_ROLE    SWITCHOVER_STATUS
--------- -------------------- ---------------- --------------------
PRMYDB    READ WRITE           PRIMARY          TO STANDBY

Hemos realizado el switchover, nuestra base de datos Primaria tiene ahora un rol de Standby, y la base de datos Standby tiene ahora el rol de Primaria.

Switchback

Para regresar a los roles originales de DataGuard sólo necesitamos realizar los mismos pasos pero apuntando el comando de switchover hacia la instancia Primaria original:

Switchover en la Standby Previa

[oracle@patodgstby ~]$ sqlplus / as sysdba

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> alter database switchover to PRMYDB verify;

Database altered.

SQL> alter database switchover to PRMYDB;

Database altered.

SQL> exit
Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

[oracle@patodgstby ~]$

Abrir la Base de Datos Primaria

[oracle@patodgprmy ~]$ sqlplus / as sysdba

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> alter database open;

Database altered.

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 TALLER                         READ WRITE NO
SQL>

Iniciar la Recuperación en la Standby

[oracle@patodgstby ~]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Wed Jul 1 18:48:43 2020
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup mount
ORACLE instance started.

Total System Global Area 1610609888 bytes
Fixed Size                  9135328 bytes
Variable Size             956301312 bytes
Database Buffers          637534208 bytes
Redo Buffers                7639040 bytes

SQL> alter database recover managed standby database disconnect from session;

Database altered.

SQL>

Validación de Datos

Para validar que ningún dato se perdió por realizar el switchover y el switchback, insertamos un renglón en una tabla de prueba en cada paso identificando el hostname origen de los datos.
Aquí podemos ver la salida mostrando que sin importar en dónde esté la Primaria, los datos son actualizados y replicados hacia la otra instancia:

SQL> select * from tdgso ;

        I1 C1                   C2           D1
---------- -------------------- ------------ ---------------------------------------------------------------------------
         1 Before Switchover    patodgprmy   01-JUL-20 05.48.28.945090 PM
         2 After Switchover     patodgstby   01-JUL-20 06.12.24.804569 PM
         3 Switchback           patodgprmy   01-JUL-20 06.55.16.232940 PM

Conclusión

Hemos realizado exitosamente un switchover y un switchback en nuestro ambiente DataGuard en Oracle 19c multitenant, revisando los pasos internos en el registro de alertas y también validando que los datos de nuestra base de datos pluggable se mantienen actualizados sin importar dónde se encuentra la instancia Primaria.