Why Data Guard Lag Happens in Production: Sync, I/O and Network Deep Dive
Oracle Database: 19.18.0.0.0 Enterprise Edition • Primary: 2-Node RAC, 4.8 TB OLTP • Standby: Physical Standby (Active Data Guard)
Protection Mode: Maximum Availability (SYNC/AFFIRM) • Network: Dedicated 1 GbE WAN (120 km distance, RTT 1.8 ms)
Peak Load: 2,800 TPS, 180 MB/sec redo generation • Application: Core banking transaction processing
The alert arrives at 11:43 PM: "Data Guard apply lag exceeds 900 seconds." The DBA on call opens the monitoring dashboard. Transport lag is 180 seconds. Apply lag is 900 seconds. The standby is 15 minutes behind the primary. If the primary fails right now, 15 minutes of financial transactions could be at risk.
This scenario plays out in production Data Guard environments more often than most teams admit. Lag is not a single problem; it is six different problems that look identical from the outside. Transport lag and apply lag have entirely different root causes, different diagnostic queries, and different fixes. Treating them the same wastes hours of investigation time.
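Before any deeper diagnosis, the two lag types can be separated with a single query against `V$DATAGUARD_STATS` on the standby. A minimal sketch (column output format commands omitted; run as a privileged user on the standby instance):

```sql
-- Run on the standby. Transport lag measures redo not yet received;
-- apply lag measures redo received but not yet applied.
SELECT name,
       value,
       time_computed,
       datum_time
FROM   v$dataguard_stats
WHERE  name IN ('transport lag', 'apply lag');
```

If transport lag is near zero but apply lag is large, the redo is arriving on time and the bottleneck is on the apply side; if transport lag itself is large, the problem is upstream in the network or the primary's redo shipping.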
This guide covers every real cause of Data Guard lag I have diagnosed in production, the exact SQL to prove which one you are dealing with, and the specific fix for each. No guesswork. No generic advice about "check your network." Precise diagnosis first, then precise resolution.