Showing posts with label Data Guard. Show all posts
Showing posts with label Data Guard. Show all posts

Thursday, November 27, 2025

Stop Guessing: The "One-Shot" Script to Check Oracle Data Guard Health (Lags, Gaps & MRP)

Oracle Data Guard Health Check Script Dashboard Primary Standby



If you are a DBA, you know the panic of a "Quiet Standby." The alerts are silent. The phone isn't ringing. But deep down, you wonder: Is my Disaster Recovery (DR) site actually in sync, or has it been stuck on Sequence #10452 since last Tuesday?

Too many monitoring tools (like OEM or Zabbix) only trigger an alert when the lag hits a threshold (e.g., "Lag > 30 Mins"). By then, it’s often too late. You don't just want to know if there is a lag; you need to know where the lag is.

Is it the Network (Transport Lag)? Or is it the Disk/CPU (Apply Lag)?

Below is the exact script I use in my daily health checks. It consolidates 4 different dynamic performance views (v$dataguard_statsv$managed_standbyv$archive_gapv$database) into one single "Truth" report.

The Script (dg_health_check.sql)

Save this as dg_health_check.sql and run it on your Standby Database.

SQL
SET LINESIZE 200 PAGESIZE 1000 CHECK OFF FEEDBACK OFF ECHO OFF VERIFY OFF
COL name FORMAT a30
COL value FORMAT a20
COL unit FORMAT a30
COL time_computed FORMAT a25
COL process FORMAT a10
COL status FORMAT a15
COL sequence# FORMAT 99999999
COL block# FORMAT 999999
COL error_message FORMAT a50

PROMPT ========================================================
PROMPT  ORACLE DATA GUARD HEALTH CHECK (Run on Standby)
PROMPT ========================================================

PROMPT
PROMPT 1. DATABASE ROLE & PROTECTION MODE
PROMPT ----------------------------------------
SELECT name, db_unique_name, database_role, open_mode, protection_mode 
FROM v$database;

PROMPT
PROMPT 2. REAL-TIME LAG STATISTICS (The Source of Truth)
PROMPT ----------------------------------------
-- Transport Lag = Delay in receiving data (Network Issue)
-- Apply Lag     = Delay in writing data (IO/CPU Issue)
SELECT name, value, unit, time_computed 
FROM v$dataguard_stats 
WHERE name IN ('transport lag', 'apply lag', 'estimated startup time');

PROMPT
PROMPT 3. MRP (MANAGED RECOVERY PROCESS) STATUS
PROMPT ----------------------------------------
-- IF NO ROWS SELECTED: Your recovery is STOPPED.
-- Look for 'APPLYING_LOG' or 'WAIT_FOR_LOG'
SELECT process, status, thread#, sequence#, block# 
FROM v$managed_standby 
WHERE process LIKE 'MRP%';

PROMPT
PROMPT 4. GAP DETECTION
PROMPT ----------------------------------------
-- If rows appear here, you have a missing archive log that FAL_SERVER could not fetch.
SELECT * FROM v$archive_gap;

PROMPT
PROMPT 5. RECENT ERRORS (Last 10 Events)
PROMPT ----------------------------------------
SELECT TO_CHAR(timestamp, 'DD-MON-RR HH24:MI:SS') as err_time, message 
FROM v$dataguard_status 
WHERE severity IN ('Error','Fatal') 
AND timestamp > sysdate-1
ORDER BY timestamp DESC
FETCH FIRST 10 ROWS ONLY;

PROMPT ========================================================
PROMPT  END OF REPORT
PROMPT ========================================================

Oracle Data Guard Transport Lag vs Apply Lag Architecture Diagram


How to Analyze the Output (Like a Senior DBA)

Scenario A: High Transport Lag

  • What you see: Transport Lag is high (e.g., +00 01:20:00), but Apply Lag is low.

  • What it means: Your Primary database is generating Redo faster than your network can ship it.

  • The Fix: Check your network bandwidth. If you are using Oracle 19c or 23ai, consider enabling Redo Compression in your Data Guard broker configuration (EditDatabase Set Property RedoCompression='ENABLE').

Scenario B: High Apply Lag

  • What you see: Transport Lag is near 0, but Apply Lag is climbing (e.g., +00 00:45:00).

  • What it means: The data is there (on the standby server), but the database can't write it to disk fast enough. This often happens during batch loads or index rebuilds on the Primary.

  • The Fix: Check I/O stats on the Standby. Ensure you are using Real-Time Apply so the MRP (Managed Recovery Process) reads directly from Standby Redo Logs (SRLs) rather than waiting for archive logs to be finalized.

Scenario C: MRP Status is "WAIT_FOR_GAP"

  • What you see: In Section 3, the status is WAIT_FOR_GAP.

  • What it means: A severe gap has occurred. The Standby is missing a specific sequence number and cannot proceed until you manually register that file.

  • The Fix: Run the query in Section 4 (v$archive_gap) to identify the missing sequence, restore it from backup, and register it.

Why this works in 2026

Old school scripts relied on v$archived_log, which only tells you history. In modern Oracle Cloud (OCI) and Hybrid environments, v$dataguard_stats is the only view that accurately calculates the time difference between the Primary commit and the Standby visibility.

Chetan Yadav is a Senior Oracle, PostgreSQL, MySQL & Cloud DBA with 14+ years of experience supporting high-traffic production environments across AWS, Azure, and on-premise systems. His core expertise includes Oracle RAC, ASM, Data Guard, performance tuning, HA/DR design, monitoring frameworks, and real-world troubleshooting.

He also trains DBAs globally through deep-dive technical content, hands-on sessions, and automation workflows using n8n, AI tools, and modern monitoring stacks. His mission is to help DBAs solve real production problems and grow into high-paying remote roles worldwide.

Chetan regularly publishes expert content across Oracle, PostgreSQL, MySQL, and Cloud DBA technologies—including performance tuning guides, DR architectures, monitoring tools, scripts, and real incident-based case studies.


Explore More Technical Work

LinkedIn (Professional Profile & Articles)
https://www.linkedin.com/in/chetanyadavvds/

YouTube – Oracle Foundations Playlist
https://www.youtube.com/playlist?list=PL5TN6ECUWGROHQGXep_5hff-2ageWTp4b

Telegram – LevelUp_Careers DBA Tips
https://t.me/LevelUp_Careers

Instagram – Oracle/Cloud Learning Reels
https://www.instagram.com/levelup_careers/

Facebook Page – OracleDBAInfo
https://www.facebook.com/OracleDBAInfo

These platforms feature guides, scripts, diagrams, troubleshooting workflows, and real-world DBA case studies designed for database professionals worldwide.

Wednesday, April 26, 2023

Step-by-step guide with commands for configuring Oracle 19c Data Guard Physical Standby



Step 1: Install Oracle 19c software on both primary and standby servers.

  • Download the Oracle Database software for Oracle 19c from the Oracle website.
  • Follow the installation instructions for your operating system to install the Oracle Database software on both the primary and standby servers.

Step 2: Create a standby database on the standby server.

  • On the standby server, create an empty Oracle database using the Database Configuration Assistant (DBCA) or manually with SQL scripts.
  • Example SQL script to create a standby database:
sql
CREATE DATABASE <database_name> USER SYS IDENTIFIED BY <sys_password> USER SYSTEM IDENTIFIED BY <system_password> LOGFILE GROUP 1 ('<log_file_directory>/redo01.log') SIZE 100M, GROUP 2 ('<log_file_directory>/redo02.log') SIZE 100M, GROUP 3 ('<log_file_directory>/redo03.log') SIZE 100M;

Step 3: Enable archivelog mode on the primary database.

  • Connect to the primary database with a privileged user.
  • Enable archivelog mode using the following SQL command:
sql
ALTER DATABASE ARCHIVELOG;

Step 4: Configure log shipping on the primary database.

  • Connect to the primary database with a privileged user.
  • Configure log shipping to automatically transfer archived redo logs to the standby server using the following SQL command:
sql
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='<standby_server_directory>' SCOPE=BOTH; ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE SCOPE=BOTH;

Step 5: Create standby redo logs on the standby database.

  • Connect to the standby database with a privileged user.
  • Create standby redo logs using the following SQL command:
sql
ALTER DATABASE ADD STANDBY LOGFILE GROUP 4 ('<standby_redo_log_directory>/standby_redo01.log') SIZE 100M, GROUP 5 ('<standby_redo_log_directory>/standby_redo02.log') SIZE 100M, GROUP 6 ('<standby_redo_log_directory>/standby_redo03.log') SIZE 100M;

Step 6: Configure Data Guard properties on the primary database.

  • Connect to the primary database with a privileged user.
  • Configure Data Guard properties using the following SQL command:
sql
ALTER SYSTEM SET DB_UNIQUE_NAME='<primary_db_unique_name>' SCOPE=BOTH; ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='<standby_server_directory>' SCOPE=BOTH; ALTER SYSTEM SET LOG_ARCHIVE_CONFIG='DG_CONFIG=(<primary_db_unique_name>,<standby_db_unique_name>)' SCOPE=BOTH;

Step 7: Start redo log apply on the standby database.

  • Connect to the standby database with a privileged user.
  • Start redo log apply using the following SQL command:
sql
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;

Step 8: Monitor Data Guard configuration.

  • Monitor the Data Guard configuration using Oracle Database utilities, such as Data Guard Broker, Enterprise Manager, or SQL queries, to check the status and performance of the primary and standby databases.

Step 9: Test failover and switchover scenarios.

  • Perform failover and switchover scenarios to test the Data Guard configuration and ensure its reliability, following Oracle's documentation and best practices.

That's it! You have successfully configured Oracle 19c Data Guard Physical Standby using the above commands. Remember to customize the commands according to your specific environment and requirements, and refer to Oracle's documentation for detailed guidance.



Looking forward to your suggestions.. please feel free to write to me.. Your help would definitely help to improve our blog. Chetan Yadav