At 3 AM on a Tuesday, our production Oracle 19c database crashed. Corrupted datafile. The application team was screaming. The CTO was on the call. Everyone looked at me.
This guide covers production-tested automation frameworks for Oracle 19c and 23ai. If you're still running manual RMAN scripts or struggling with backup consistency, these patterns will save you hours every week and make disaster recovery predictable.
Let's quantify what manual backup management actually costs your organization in time, risk, and recovery capability.
Oracle 23ai introduces several backup and recovery enhancements that change automation strategies.
A production-grade backup automation framework needs more than just RMAN scripts. Here's the architecture that works.
This is the production-tested full backup script we use for Oracle 19c and 23ai databases.
#!/bin/bash
# backup_orchestrator.sh - Full database backup automation
# Compatible with Oracle 19c and 23ai
# Configuration
export ORACLE_SID=$1
export BACKUP_BASE="/oracle/backup"
export LOG_DIR="${BACKUP_BASE}/logs"
export RMAN_DIR="${BACKUP_BASE}/rman"
export BACKUP_DEST="/backup/oracle/${ORACLE_SID}"
DATE_STAMP=$(date +%Y%m%d_%H%M%S)
LOG_FILE="${LOG_DIR}/backup_${ORACLE_SID}_${DATE_STAMP}.log"
# Source Oracle environment
. /home/oracle/.bash_profile
# Logging function
log_message() {
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a ${LOG_FILE}
}
# Error handling
error_exit() {
log_message "ERROR: $1"
# Send alert (integrate with your alerting system)
echo "BACKUP FAILED: ${ORACLE_SID}" | mail -s "Oracle Backup Failure" dba-team@company.com
exit 1
}
# Main backup function
main() {
log_message "Starting full backup for ${ORACLE_SID}"
# Pre-backup checks
log_message "Running pre-backup validation..."
sqlplus -s / as sysdba @${BACKUP_BASE}/sql/pre_backup_checks.sql >> ${LOG_FILE} 2>&1
if [ $? -ne 0 ]; then
error_exit "Pre-backup checks failed"
fi
# Check Fast Recovery Area space
log_message "Checking FRA space..."
FRA_USED=$(sqlplus -s / as sysdba < 85" | bc -l) )); then
error_exit "FRA usage at ${FRA_USED}% - insufficient space for backup"
fi
log_message "FRA usage: ${FRA_USED}%"
# Execute RMAN backup
log_message "Starting RMAN full backup..."
rman target / msglog=${LOG_FILE} append <
Scheduling the Backup
# Oracle Backup Automation - Crontab entries
# Full backup: Every Sunday at 2 AM
0 2 * * 0 /oracle/backup/bin/backup_orchestrator.sh PRODDB
# Incremental backup: Monday-Saturday at 2 AM
0 2 * * 1-6 /oracle/backup/bin/incremental_backup.sh PRODDB
# Archive log backup: Every 4 hours
0 */4 * * * /oracle/backup/bin/archivelog_backup.sh PRODDB
# Backup validation: Daily at 8 AM
0 8 * * * /oracle/backup/bin/backup_validator.sh PRODDB
# Cleanup old logs: Weekly
0 3 * * 0 find /oracle/backup/logs -name "*.log" -mtime +30 -delete
5. Incremental Backup Strategy with Automated Scheduling
Incremental backups reduce backup windows and storage requirements. Here's the optimal strategy for production systems.
Incremental Backup Levels Explained
- Level 0: Full backup baseline (all data blocks)
- Level 1 Differential: Backs up blocks changed since last Level 0
- Level 1 Cumulative: Backs up blocks changed since last Level 0 (faster restore)
Recommended Weekly Schedule
| Day |
Backup Type |
Backup Window |
Recovery Complexity |
| Sunday |
Level 0 (Full) |
2-4 hours |
Simplest |
| Monday-Wednesday |
Level 1 Differential |
30-60 min |
Moderate |
| Thursday |
Level 1 Cumulative |
45-90 min |
Simple |
| Friday-Saturday |
Level 1 Differential |
30-60 min |
Moderate |
RUN {
ALLOCATE CHANNEL ch1 DEVICE TYPE DISK;
ALLOCATE CHANNEL ch2 DEVICE TYPE DISK;
# Incremental Level 1 differential backup
BACKUP AS COMPRESSED BACKUPSET
INCREMENTAL LEVEL 1
DATABASE
TAG 'INCR_DIFF_${DATE_STAMP}'
FILESPERSET 4;
# Archive log backup
BACKUP ARCHIVELOG ALL DELETE INPUT TAG 'ARCH_${DATE_STAMP}';
# Control file backup
BACKUP CURRENT CONTROLFILE TAG 'CTL_INCR_${DATE_STAMP}';
# Maintenance
CROSSCHECK BACKUP;
DELETE NOPROMPT EXPIRED BACKUP;
REPORT OBSOLETE;
RELEASE CHANNEL ch1;
RELEASE CHANNEL ch2;
}
6. Automated Restore and Recovery Procedures
Restoring a database shouldn't require looking up documentation at 3 AM. Here's the automated restore framework.
#!/bin/bash
# restore_orchestrator.sh - Automated database restore
# Usage: ./restore_orchestrator.sh ORACLE_SID [POINT_IN_TIME]
export ORACLE_SID=$1
export RESTORE_TIME=$2 # Optional: Format YYYY-MM-DD HH24:MI:SS
export BACKUP_BASE="/oracle/backup"
export LOG_DIR="${BACKUP_BASE}/logs"
DATE_STAMP=$(date +%Y%m%d_%H%M%S)
LOG_FILE="${LOG_DIR}/restore_${ORACLE_SID}_${DATE_STAMP}.log"
. /home/oracle/.bash_profile
log_message() {
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a ${LOG_FILE}
}
restore_database() {
log_message "Starting restore for ${ORACLE_SID}"
# Shutdown database
log_message "Shutting down database..."
sqlplus -s / as sysdba <> ${LOG_FILE}
SHUTDOWN ABORT;
EXIT;
EOF
# Startup in nomount mode
log_message "Starting database in NOMOUNT..."
sqlplus -s / as sysdba <> ${LOG_FILE}
STARTUP NOMOUNT;
EXIT;
EOF
# Restore controlfile
log_message "Restoring controlfile from autobackup..."
rman target / msglog=${LOG_FILE} append <> ${LOG_FILE}
ALTER DATABASE MOUNT;
EXIT;
EOF
# Restore and recover database
log_message "Restoring and recovering database..."
if [ -z "${RESTORE_TIME}" ]; then
# Complete recovery
rman target / msglog=${LOG_FILE} append <> ${LOG_FILE}
ALTER DATABASE OPEN;
EXIT;
EOF
else
sqlplus -s / as sysdba <> ${LOG_FILE}
ALTER DATABASE OPEN RESETLOGS;
EXIT;
EOF
fi
# Post-restore validation
log_message "Running post-restore validation..."
sqlplus -s / as sysdba @${BACKUP_BASE}/sql/post_restore_validation.sql >> ${LOG_FILE}
log_message "Restore completed successfully"
log_message "Database ${ORACLE_SID} is now OPEN"
# Send notification
echo "Database restore completed successfully
Database: ${ORACLE_SID}
Restore Time: ${RESTORE_TIME:-Complete Recovery}
Log: ${LOG_FILE}" | mail -s "Oracle Restore Success: ${ORACLE_SID}" dba-team@company.com
}
# Execute restore
restore_database
7. Monitoring, Alerting, and Backup Validation
Backups are useless if you can't restore from them. Automated validation ensures your backups work.
Backup Validation Checks
-- Check recent backup status
SELECT
TO_CHAR(start_time, 'YYYY-MM-DD HH24:MI') AS backup_start,
TO_CHAR(end_time, 'YYYY-MM-DD HH24:MI') AS backup_end,
ROUND((end_time - start_time) * 24 * 60, 2) AS duration_minutes,
input_type,
status,
ROUND(input_bytes/1024/1024/1024, 2) AS input_gb,
ROUND(output_bytes/1024/1024/1024, 2) AS output_gb,
ROUND(output_bytes/input_bytes * 100, 2) AS compression_ratio
FROM v$rman_backup_job_details
WHERE start_time > SYSDATE - 7
ORDER BY start_time DESC;
-- Check for backup failures
SELECT
session_key,
TO_CHAR(start_time, 'YYYY-MM-DD HH24:MI:SS') AS failure_time,
operation,
status,
mbytes_processed,
object_type
FROM v$rman_status
WHERE status IN ('FAILED', 'FAILED WITH WARNINGS')
AND start_time > SYSDATE - 7
ORDER BY start_time DESC;
-- Validate backup coverage
SELECT
file#,
name,
TO_CHAR(MAX(completion_time), 'YYYY-MM-DD HH24:MI:SS') AS last_backup
FROM v$backup_datafile
WHERE completion_time IS NOT NULL
GROUP BY file#, name
HAVING MAX(completion_time) < SYSDATE - 1
ORDER BY MAX(completion_time);
Automated Alert System
#!/bin/bash
# backup_validator.sh - Automated backup validation and alerting
export ORACLE_SID=$1
export BACKUP_BASE="/oracle/backup"
DATE_STAMP=$(date +%Y%m%d)
. /home/oracle/.bash_profile
# Check if backup ran in last 24 hours
BACKUP_COUNT=$(sqlplus -s / as sysdba < SYSDATE - 1
AND status = 'COMPLETED';
EXIT;
EOF
)
if [ ${BACKUP_COUNT} -eq 0 ]; then
echo "ALERT: No successful backup in last 24 hours for ${ORACLE_SID}" | \
mail -s "Backup Missing: ${ORACLE_SID}" dba-team@company.com
exit 1
fi
# Check for backup failures
FAILURE_COUNT=$(sqlplus -s / as sysdba < SYSDATE - 1;
EXIT;
EOF
)
if [ ${FAILURE_COUNT} -gt 0 ]; then
echo "ALERT: ${FAILURE_COUNT} backup failures detected in ${ORACLE_SID}" | \
mail -s "Backup Failures: ${ORACLE_SID}" dba-team@company.com
fi
# Check FRA usage
FRA_USAGE=$(sqlplus -s / as sysdba < 90" | bc -l) )); then
echo "ALERT: FRA usage at ${FRA_USAGE}% for ${ORACLE_SID}" | \
mail -s "FRA Space Critical: ${ORACLE_SID}" dba-team@company.com
fi
echo "Backup validation completed for ${ORACLE_SID}"
8. Oracle 23ai New Features: Automated Recovery Enhancements
Oracle 23ai introduces several automation features that make backup and recovery even more reliable.
1. Predictive Recovery Time (23ai)
-- Oracle 23ai: Predict recovery time before starting
RMAN> ESTIMATE RECOVERY TIME;
Starting estimate recovery at 05-FEB-26
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=245 device type=DISK
Estimated Recovery Time Statistics
==================================
Recovery Type: COMPLETE RECOVERY
Estimated Duration: 23 minutes
Estimated Data Blocks: 2,456,789
Estimated Archive Logs: 45 logs (125GB)
Confidence Level: 95%
Recovery time estimated based on:
- Historical recovery performance
- Current system resources
- Backup block sizes
- Archive log volume
2. Automated Cloud Backup (23ai)
-- Configure direct backup to AWS S3 (Oracle 23ai)
CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE'
PARMS 'SBT_LIBRARY=/opt/oracle/lib/libosbws.so,
SBT_PARMS=(OSB_WS_BUCKET=prod-oracle-backups,
OSB_WS_REGION=us-east-1)';
-- Backup directly to cloud
BACKUP AS COMPRESSED BACKUPSET
INCREMENTAL LEVEL 0
DATABASE
TAG 'CLOUD_BACKUP_${DATE}'
PLUS ARCHIVELOG DELETE INPUT;
-- Oracle 23ai automatically:
-- - Encrypts backups to cloud
-- - Manages object lifecycle
-- - Validates backups after upload
-- - Provides cross-region replication
3. Incremental Forever with Auto-Merge (23ai)
Oracle 23ai can automatically maintain incremental backups without periodic Level 0 backups.
-- Configure incremental forever strategy
CONFIGURE BACKUP OPTIMIZATION ON;
CONFIGURE INCREMENTAL FOREVER ON;
CONFIGURE AUTO MERGE WINDOW 7;
-- Oracle 23ai automatically:
-- - Creates initial Level 0 backup
-- - Takes Level 1 incrementals daily
-- - Merges incrementals into image copy weekly
-- - Never requires another full backup
BACKUP INCREMENTAL LEVEL 1
FOR RECOVER OF COPY WITH TAG 'INCR_FOREVER'
DATABASE;
9. FAQ
Should I use RMAN or Data Pump for backups?
Use RMAN for production databases, always. RMAN provides block-level backups, point-in-time recovery, incremental backups, and is integrated with Oracle's recovery mechanisms. Data Pump is for logical backups (schema exports, data migration) not for disaster recovery. Think of it this way: RMAN is for "the database crashed, restore everything NOW" scenarios. Data Pump is for "we need to migrate this schema to test environment" scenarios.
How often should I test restore procedures?
Monthly minimum for production databases, quarterly for less critical systems. Your automated restore testing should include: (1) Full database restore to a separate server, (2) Point-in-time recovery to verify archive log chain, (3) Single tablespace recovery, (4) Individual datafile recovery. If you haven't restored from your backups in the last 30 days, you don't actually know if they work. Automate this testing—don't wait for production outage to discover your backups are corrupt.
What's the difference between Oracle 19c and 23ai backup automation?
Oracle 23ai adds: (1) Native cloud backup API (no OSB/ZDLRA needed), (2) Predictive recovery time estimation using ML, (3) Automated incremental forever with auto-merge, (4) Built-in backup validation during backup (not separate step), (5) Automated FRA expansion. For most DBAs, the biggest win is native cloud backup support—you can backup directly to S3/Azure/GCP without third-party tools. The predictive recovery estimation is also huge for SLA planning.
Should I mention backup automation experience on my resume if I haven't built it from scratch?
Absolutely. Even if you're using scripts someone else wrote, if you understand how they work, can troubleshoot backup failures, and can perform restores, that's valuable experience. Write: "Managed automated Oracle RMAN backup framework for production databases, including monitoring backup success rates, troubleshooting failures, and performing test restores." What matters is you understand backup/recovery concepts and can execute them reliably—not whether you wrote the first version of the script.
10. Related Reading from Real Production Systems
About the Author
Chetan Yadav
Chetan Yadav is a Senior Oracle, PostgreSQL, MySQL, and Cloud DBA with 14+ years of hands-on experience managing production databases across on-premises, hybrid, and cloud environments. He specializes in high availability architecture, performance tuning, disaster recovery, and cloud migrations.
Throughout his career, Chetan has designed and implemented automated backup frameworks for mission-critical Oracle databases in finance, healthcare, and e-commerce sectors. He has successfully recovered dozens of production databases from disasters and has automated backup processes that manage petabytes of Oracle data.
Chetan is passionate about mentoring early-career DBAs and sharing real-world production lessons that aren't found in documentation. His writing focuses on practical decision-making, career growth, and the operational realities of database administration.
This blog focuses on real-world DBA problems, career growth, and practical learning — not theoretical documentation or vendor marketing.
very detail information, too good.
ReplyDelete