Automating Backup and Restore in Oracle 19c and 23ai: Complete DBA Guide
Production-Tested RMAN Automation Scripts and Recovery Strategies
📅 February 05, 2026
👤 Chetan Yadav - Senior Oracle & Cloud DBA
⏱️ 18-20 min read
💾 Oracle RMAN Automation - From Manual Backups to Fully Automated Recovery
At 3 AM on a Tuesday, our production Oracle 19c database crashed. Corrupted datafile. The application team was screaming. The CTO was on the call. Everyone looked at me.
I typed one command: ./restore_prod.sh PRODDB 2026-02-04_23:00. Twenty-three minutes later, the database was back online with zero data loss. The automated backup and restore framework I'd built six months earlier just saved our jobs.
Manual backup and restore processes are where Oracle DBAs lose the most time. Automating RMAN backups isn't about convenience—it's about reliability, consistency, and being able to restore in minutes instead of hours when production is down.
This guide covers production-tested automation frameworks for Oracle 19c and 23ai. If you're still running manual RMAN scripts or struggling with backup consistency, these patterns will save you hours every week and make disaster recovery predictable.
Six months ago, our CTO announced we were moving to a "multi-cloud strategy." The board wanted vendor independence. Engineering wanted best-of-breed services. Finance wanted competitive pricing.
I was the DBA who had to make it actually work.
Designing multi-cloud database architecture isn't about running the same database everywhere. It's about knowing which workload belongs on which platform, how to synchronize data across clouds, and when vendor lock-in is actually acceptable.
This guide covers reference architectures from production systems running across Oracle on-premises, AWS RDS, Azure SQL Managed Instance, and GCP AlloyDB. If you're a DBA planning or managing multi-cloud databases, these are the patterns that work.
What Works, What Doesn't, and What Performance Surprises to Expect
📅 February 03, 2026
👤 Chetan Yadav - Senior Oracle & Cloud DBA
⏱️ 18-20 min read
🚀 GCP AlloyDB vs Oracle 19c - Performance Tuning from a Veteran DBA's Perspective
Three months after our team moved a critical Oracle 19c OLTP system to GCP AlloyDB, the CFO asked in the quarterly review, "Are we seeing the 4x performance improvement Google promised?"
I had the AWR reports from Oracle and the performance metrics from AlloyDB. Same queries, same data volume, same transaction load. The answer was more nuanced than a simple yes or no.
Some queries ran 6x faster in AlloyDB. Others ran slower. The difference wasn't the database—it was understanding which PostgreSQL-based optimizations AlloyDB uses and how they compare to Oracle's approach.
This guide is what I learned comparing Oracle 19c performance tuning to GCP AlloyDB in production. If you're an Oracle DBA evaluating AlloyDB, this covers the performance differences that actually matter.
Two weeks before our planned Azure migration cutover, the project manager asked me, "We've tested the migration three times in dev. What could possibly go wrong in production?"
I showed him my checklist. Sixty-three items. He thought I was being overcautious. Until we hit production and discovered our application connection strings were hardcoded with Oracle-specific syntax that Azure SQL Managed Instance didn't support. The migration succeeded, but the application couldn't connect.
That's when he understood: migration isn't just moving data. It's validating every assumption about how your application interacts with the database.
This checklist represents lessons from five Azure Managed Instance migrations, including two that required emergency rollbacks. If you're an Oracle DBA planning an Azure migration, this is the checklist I wish I had on my first project.
If you are a DBA, you know the panic of a "Quiet Standby." The alerts are silent. The phone isn't ringing. But deep down, you wonder: Is my Disaster Recovery (DR) site actually in sync, or has it been stuck on Sequence #10452 since last Tuesday?
Too many monitoring tools (like OEM or Zabbix) only trigger an alert when the lag hits a threshold (e.g., "Lag > 30 Mins"). By then, it’s often too late. You don't just want to know if there is a lag; you need to know where the lag is.
Is it the Network (Transport Lag)? Or is it the Disk/CPU (Apply Lag)?
Below is the exact script I use in my daily health checks. It consolidates 4 different dynamic performance views (v$dataguard_stats, v$managed_standby, v$archive_gap, v$database) into one single "Truth" report.
The Script (dg_health_check.sql)
Save this as dg_health_check.sql and run it on your Standby database (for example: sqlplus / as sysdba @dg_health_check.sql).
SQL
SET LINESIZE 200 PAGESIZE 1000 FEEDBACK OFF ECHO OFF VERIFY OFF
COL name FORMAT a30
COL value FORMAT a20
COL unit FORMAT a30
COL time_computed FORMAT a25
COL process FORMAT a10
COL status FORMAT a15
COL sequence# FORMAT 99999999
COL block# FORMAT 999999
COL error_message FORMAT a50
PROMPT ========================================================
PROMPT ORACLE DATA GUARD HEALTH CHECK (Run on Standby)
PROMPT ========================================================
PROMPT
PROMPT 1. DATABASE ROLE & PROTECTION MODE
PROMPT ----------------------------------------
SELECT name, db_unique_name, database_role, open_mode, protection_mode
FROM v$database;
PROMPT
PROMPT 2. REAL-TIME LAG STATISTICS (The Source of Truth)
PROMPT ----------------------------------------
-- Transport Lag = Delay in receiving data (Network Issue)
-- Apply Lag     = Delay in writing data (IO/CPU Issue)
SELECT name, value, unit, time_computed
FROM v$dataguard_stats
WHERE name IN ('transport lag', 'apply lag', 'estimated startup time');
PROMPT
PROMPT 3. MRP (MANAGED RECOVERY PROCESS) STATUS
PROMPT ----------------------------------------
-- IF NO ROWS SELECTED: Your recovery is STOPPED.
-- Look for 'APPLYING_LOG' or 'WAIT_FOR_LOG'
SELECT process, status, thread#, sequence#, block#
FROM v$managed_standby
WHERE process LIKE 'MRP%';
PROMPT
PROMPT 4. GAP DETECTION
PROMPT ----------------------------------------
-- If rows appear here, you have a missing archive log that FAL_SERVER could not fetch.
SELECT * FROM v$archive_gap;
PROMPT
PROMPT 5. RECENT ERRORS (Last 10 Events)
PROMPT ----------------------------------------
SELECT TO_CHAR(timestamp, 'DD-MON-RR HH24:MI:SS') AS err_time, message
FROM v$dataguard_status
WHERE severity IN ('Error', 'Fatal')
  AND timestamp > SYSDATE - 1
ORDER BY timestamp DESC
FETCH FIRST 10 ROWS ONLY;
PROMPT ========================================================
PROMPT END OF REPORT
PROMPT ========================================================
How to Analyze the Output (Like a Senior DBA)
Scenario A: High Transport Lag
What you see: Transport Lag is high (e.g., +00 01:20:00), but Apply Lag is low.
What it means: Your Primary database is generating Redo faster than your network can ship it.
The Fix: Check your network bandwidth. If you are using Oracle 19c or 23ai, consider enabling Redo Compression in your Data Guard broker configuration (EDIT DATABASE ... SET PROPERTY RedoCompression='ENABLE').
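In DGMGRL, that change is a one-liner. A sketch ('proddb_stby' is a hypothetical DB_UNIQUE_NAME; note that redo transport compression requires the Advanced Compression option):

```sql
-- Run inside DGMGRL, connected to the broker configuration
EDIT DATABASE 'proddb_stby' SET PROPERTY RedoCompression = 'ENABLE';

-- Verify the property took effect
SHOW DATABASE 'proddb_stby' RedoCompression;
```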
Scenario B: High Apply Lag
What you see: Transport Lag is near 0, but Apply Lag is climbing (e.g., +00 00:45:00).
What it means: The data is there (on the standby server), but the database can't write it to disk fast enough. This often happens during batch loads or index rebuilds on the Primary.
The Fix: Check I/O stats on the Standby. Ensure you are using Real-Time Apply so the MRP (Managed Recovery Process) reads directly from Standby Redo Logs (SRLs) rather than waiting for archive logs to be finalized.
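A quick way to confirm Real-Time Apply on the standby (a sketch; from 12c onward MRP defaults to real-time apply when Standby Redo Logs are configured):

```sql
-- Expect 'MANAGED REAL TIME APPLY' when MRP reads directly from SRLs
SELECT recovery_mode
FROM   v$archive_dest_status
WHERE  dest_id = 1;

-- If apply was started in archive-log-only mode, restart it:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
```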
Scenario C: MRP Status is "WAIT_FOR_GAP"
What you see: In Section 3, the status is WAIT_FOR_GAP.
What it means: A severe gap has occurred. The Standby is missing a specific sequence number and cannot proceed until you manually register that file.
The Fix: Run the query in Section 4 (v$archive_gap) to identify the missing sequence, restore it from backup, and register it.
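The manual fix typically looks like this (a sketch; the archive log path and file name are illustrative, not from a real system):

```sql
-- On the standby: identify the missing sequence range
SELECT thread#, low_sequence#, high_sequence#
FROM   v$archive_gap;

-- After restoring the file from backup and copying it to the standby host,
-- register it so MRP can resume (path is hypothetical):
ALTER DATABASE REGISTER PHYSICAL LOGFILE '/u01/app/oracle/arch/1_10452_1122334455.arc';
```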
Why this works in 2026
Old-school scripts relied on v$archived_log, which only tells you history. In modern Oracle Cloud (OCI) and hybrid environments, v$dataguard_stats is the only view that accurately calculates the time difference between a commit on the Primary and its visibility on the Standby.
Chetan Yadav is a Senior Oracle, PostgreSQL, MySQL & Cloud DBA with 14+ years of experience supporting high-traffic production environments across AWS, Azure, and on-premise systems. His core expertise includes Oracle RAC, ASM, Data Guard, performance tuning, HA/DR design, monitoring frameworks, and real-world troubleshooting.
He also trains DBAs globally through deep-dive technical content, hands-on sessions, and automation workflows using n8n, AI tools, and modern monitoring stacks. His mission is to help DBAs solve real production problems and grow into high-paying remote roles worldwide.
Chetan regularly publishes expert content across Oracle, PostgreSQL, MySQL, and Cloud DBA technologies—including performance tuning guides, DR architectures, monitoring tools, scripts, and real incident-based case studies.
These publications feature guides, scripts, diagrams, troubleshooting workflows, and real-world DBA case studies designed for database professionals worldwide.
How I Use ChatGPT and Automation to Save 3 Hours a Day as a DBA
The New Reality of Database Administration
Database environments today are more dynamic than ever. A DBA manages hybrid and multi-cloud systems across Oracle, PostgreSQL, Aurora MySQL, and other platforms.
While architecture complexity keeps growing, the number of hours in a day does not. Much of a DBA’s time still goes into manual analysis, log checks, and repetitive reporting.
To reclaim that time, I built a workflow using ChatGPT for analysis and n8n for automation. Together they now handle much of the repetitive monitoring and documentation work that used to slow me down.
Step 1: Using ChatGPT as an Analytical Assistant
I use ChatGPT as an intelligent interpreter for the technical data I already collect.
SQL and AWR Analysis
Prompt example:
Analyze this SQL execution plan. Identify expensive operations, missing indexes, and filter or join inefficiencies.
ChatGPT highlights cost-heavy steps, missing statistics, and joins that need review. I then validate insights using DBMS_XPLAN.DISPLAY_CURSOR before making any changes.
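The validation step is a standard DBMS_XPLAN call, run in the same session immediately after the statement. A sketch (the orders query is hypothetical; actual row counts require the GATHER_PLAN_STATISTICS hint or STATISTICS_LEVEL=ALL):

```sql
-- Execute the candidate statement with runtime statistics enabled
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
FROM   orders
WHERE  status = 'OPEN';

-- Compare estimated (E-Rows) vs actual (A-Rows) rows for the last cursor
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```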
Incident Summaries and RCA Drafts
Prompt example:
Summarize the top waits and likely root causes from this AWR report in concise technical language for a status email.
This produces a clean summary that I can send to teams without spending time on formatting or rewriting.
Step 2: Event-Driven Alerts with n8n
n8n watches the monitoring feed and fires structured alerts. When IO latency crosses a set threshold, the summary reads:
IO wait time on the primary database instance exceeded 60 percent. Possible cause: concurrent updates or storage contention. Review session activity and storage throughput.
Each alert is logged automatically in Google Sheets for trend analysis, so I no longer need to export or merge reports manually.
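The threshold itself can come from a plain wait-class query like this (a sketch; v$system_wait_class figures are cumulative since instance startup, so a production alert should compare deltas between two samples):

```sql
-- Share of non-idle wait time per wait class since instance startup
SELECT wait_class,
       ROUND(100 * time_waited / SUM(time_waited) OVER (), 1) AS pct_waited
FROM   v$system_wait_class
WHERE  wait_class <> 'Idle'
ORDER  BY pct_waited DESC;
```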
Step 3: The Measured Impact
After a few weeks, the results were visible:
Around 3 hours of manual effort are saved daily.
Faster communication through structured alerts.
Fewer repetitive RCA summaries.
More focus on architecture, tuning, and mentoring.
This combination of ChatGPT and n8n now runs quietly in the background, reducing operational overhead and improving accuracy.
Key Takeaways
Automation does not replace DBAs; it amplifies their impact.
ChatGPT brings analytical speed and structured communication.
n8n enables event-driven automation that scales without complexity.
If you’re managing complex environments, start with one task — maybe your daily health check or backup report — and automate it. Small steps quickly add up to big efficiency gains.
Final Thought
The next phase of database administration belongs to professionals who merge technical expertise with intelligent automation.
Instead of reacting to alerts, we should design systems that interpret themselves.
Start small, validate your results, and let automation do the routine work so you can focus on engineering.
Where I Share More
If you want to explore DBA automation, Oracle training, or real-world case studies, follow my work here: