Why Data Guard Lag Happens in Production: Sync, I/O and Network Deep Dive
Oracle Database: 19.18.0.0.0 Enterprise Edition • Primary: 2-Node RAC, 4.8 TB OLTP • Standby: Physical Standby (Active Data Guard)
Protection Mode: Maximum Availability (SYNC/AFFIRM) • Network: Dedicated 1 GbE WAN, 120 km, RTT 1.8 ms
Peak Load: 2,800 TPS, 180 MB/sec redo generation • Application: Core banking transaction processing
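The monitoring alert fires at 11:43 PM: "Data Guard apply lag exceeds 900 seconds." Transport lag is 180 seconds. Apply lag is 900 seconds. The data visible on the standby is 15 minutes behind the primary. If the primary fails right now, the roughly 3 minutes of redo that has not yet reached the standby is at risk of being lost, and failover has to wait while the standby applies the rest of its backlog.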
This scenario happens in production Data Guard environments more often than most teams admit. The problem looks the same from the outside every time, but the root cause varies from one incident to the next. Transport lag and apply lag have different causes, different diagnostic queries, and different fixes. Treating them as the same problem wastes hours of investigation.
This guide covers all six real production causes of Data Guard lag, the exact SQL to identify each one, and the specific fix for each. No guesswork. Precise diagnosis first, then precise resolution.
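As a starting point for that diagnosis, the two lag figures from the alert can be read separately on the standby. The query below is a minimal sketch, assuming it is run on the physical standby by a user with access to the `V$DATAGUARD_STATS` view. The `lag_seconds` alias is a label introduced here for illustration, not something the monitoring tool necessarily uses.

```sql
-- Run on the standby database.
-- VALUE is an interval string such as '+00 00:15:00'; it is converted
-- to seconds so it can be compared with alert thresholds.
SELECT name,
       value,
       EXTRACT(DAY    FROM TO_DSINTERVAL(value)) * 86400
     + EXTRACT(HOUR   FROM TO_DSINTERVAL(value)) * 3600
     + EXTRACT(MINUTE FROM TO_DSINTERVAL(value)) * 60
     + EXTRACT(SECOND FROM TO_DSINTERVAL(value)) AS lag_seconds,
       time_computed,   -- when the statistic was calculated
       datum_time       -- when the standby last received data from the primary
FROM   v$dataguard_stats
WHERE  name IN ('transport lag', 'apply lag');
```

A transport lag that keeps growing points at redo shipping from the primary, while an apply lag that grows with transport lag near zero points at redo apply on the standby. If `datum_time` stops advancing, the standby has lost contact with the primary and the reported values may be stale.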