Showing posts with label Oracle 23ai. Show all posts

Monday, March 2, 2026

Oracle Performance Engineering Guide: AWR, ASH and SQL Monitor in Oracle 19c and 23ai

Real Production Troubleshooting Using Oracle Performance Diagnostic Tools
📅 March 2026
👤 Chetan Yadav — Senior Oracle and Cloud DBA
⏱️ 12–14 min read
Master AWR, ASH, and SQL Monitor — The Oracle DBA’s Performance Troubleshooting Trinity
Environment Used in This Example

The troubleshooting approach described in this article reflects common production environments such as:

Oracle Database 19c  •  2-Node RAC Cluster  •  8 TB OLTP Database  •  ~2,500 Concurrent Sessions  •  Peak ~1,500 TPS

While the SQL queries shown are simplified for educational purposes, the investigation methodology mirrors real production troubleshooting practices used by experienced DBAs.

During a recent performance investigation, an OLTP database began experiencing significant latency during peak traffic. Average response time increased from roughly 300 milliseconds to more than 5 seconds.

Application teams initially suspected CPU saturation. Infrastructure teams suspected storage latency. Situations like this are common during performance incidents where multiple teams wait for the DBA team to identify the root cause.

The investigation followed a common Oracle DBA diagnostic workflow: identify workload patterns using AWR, analyze session-level activity using ASH, and investigate problematic SQL execution using SQL Monitor. These three tools form the core performance troubleshooting toolkit used by Oracle DBAs in production environments.

1. The Oracle Performance Troubleshooting Trinity

Tool          | Purpose
--------------|------------------------------------
AWR           | Historical workload analysis
ASH           | Session-level activity sampling
SQL Monitor   | Real-time SQL execution visibility

Using these tools together allows DBAs to move efficiently from symptom to root cause.

Oracle Performance Troubleshooting Workflow

  1. Identify workload changes — Review AWR load profile and wait events to determine whether the issue is caused by increased workload or resource contention.
  2. Identify dominant wait events — Examine the Top Timed Events section in the AWR report.
  3. Identify problematic SQL — Review SQL ordered by elapsed time to identify expensive queries.
  4. Investigate session behavior — Use ASH to identify blocking sessions, hot objects, or concurrency problems.
  5. Analyze execution plans — Use SQL Monitor or execution plan analysis to understand query behavior.
  6. Implement corrective action — Possible fixes include index creation, SQL tuning, statistics refresh, or application-level adjustments.
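When an incident is in progress, it helps to bracket it with an on-demand snapshot so the subsequent AWR report covers exactly the problem window rather than a generic 60-minute interval. A minimal example:

SQL — Take an On-Demand AWR Snapshot
-- Create an immediate snapshot to bracket the incident window
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;

Run it once when the incident starts and once when it ends, then generate the AWR report between those two snapshot IDs.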

2. AWR Fundamentals: What It Measures and Why It Matters

The Automatic Workload Repository (AWR) stores historical database performance metrics captured at regular intervals. By default Oracle creates snapshots every 60 minutes, storing metrics such as wait events, SQL execution statistics, system statistics, memory utilization, and I/O activity. These snapshots allow DBAs to analyze workload trends over time.

What AWR Captures

Metric Category    | Description                              | Why It Matters
-------------------|------------------------------------------|------------------------------------
Wait Events        | Time spent waiting for resources         | Identifies performance bottlenecks
Top SQL            | SQL statements consuming most resources  | Prioritize tuning efforts
System Statistics  | Database-level metrics                   | Understand workload behavior
Memory Usage       | SGA and PGA statistics                   | Detect memory pressure
I/O Statistics     | Datafile read and write activity         | Identify storage issues

Generating an AWR Report

SQL — Generate AWR Report
-- Retrieve snapshot IDs to choose the report window
SELECT snap_id, begin_interval_time, end_interval_time
FROM   dba_hist_snapshot
ORDER  BY snap_id DESC;

-- Generate the AWR report interactively
@?/rdbms/admin/awrrpt.sql

Adjusting Snapshot Retention

SQL — Configure AWR Snapshot Settings
-- Check current snapshot settings
SELECT snap_interval, retention FROM dba_hist_wr_control;

-- Modify snapshot settings (30-day retention, 30-minute interval)
BEGIN
  DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(
    retention => 43200,  -- minutes (30 days)
    interval  => 30      -- minutes
  );
END;
/
Licensing Note

AWR and ASH are part of the Oracle Diagnostics Pack, and SQL Monitor additionally requires the Tuning Pack. The following all require the appropriate pack license: AWR reports, ASH queries, DBA_HIST views, SQL Monitor reports, and ADDM analysis. If these packs are not licensed, DBAs should instead rely on Statspack, V$SESSION, V$SYSTEM_EVENT, or V$SQL. Always verify licensing compliance before using these features in production environments.
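For environments without the Diagnostics Pack, a rough wait-event overview can still be pulled from the license-free V$ views. A minimal sketch (cumulative since instance startup, so compare two runs a few minutes apart to see deltas):

SQL — Top Waits Without Diagnostics Pack
SELECT event,
       wait_class,
       total_waits,
       ROUND(time_waited_micro / 1e6, 1) AS time_waited_seconds
FROM   v$system_event
WHERE  wait_class <> 'Idle'
ORDER  BY time_waited_micro DESC
FETCH FIRST 10 ROWS ONLY;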

3. Reading AWR Reports: Focus on These Sections First

Although AWR reports can exceed 80 pages, the most important sections include: Top Timed Foreground Events, SQL Ordered by Elapsed Time, Load Profile, and Instance Efficiency Metrics.

Top Timed Foreground Events

This is the most important section. It shows where your database spent the most time. High values for db file sequential read often indicate index-driven workloads, inefficient SQL access patterns, storage latency, or missing indexes.

Figure 1: AWR Top Timed Foreground Events
62% of DB time in db file sequential read confirms an I/O-bound workload. This diagnostic was captured during a real production performance investigation on a high-concurrency OLTP system.
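To see which segments are actually behind the single-block reads, ASH samples can be joined to DBA_OBJECTS on the current object ID. A sketch (requires Diagnostics Pack; the 30-minute window is arbitrary):

SQL — Hot Objects Behind db file sequential read
SELECT o.owner,
       o.object_name,
       COUNT(*) AS samples
FROM   v$active_session_history h
JOIN   dba_objects o
       ON o.object_id = h.current_obj#
WHERE  h.event = 'db file sequential read'
AND    h.sample_time > SYSDATE - INTERVAL '30' MINUTE
GROUP  BY o.owner, o.object_name
ORDER  BY samples DESC
FETCH FIRST 10 ROWS ONLY;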

Identifying Expensive SQL

SQL — Extract Top SQL from AWR
SELECT sql_id,
       plan_hash_value,
       executions_delta AS executions,
       ROUND(elapsed_time_delta / 1000000, 2) AS elapsed_seconds,
       buffer_gets_delta,
       disk_reads_delta
FROM   dba_hist_sqlstat
WHERE  snap_id BETWEEN :begin_snap AND :end_snap
ORDER  BY elapsed_time_delta DESC
FETCH FIRST 20 ROWS ONLY;

4. ASH Deep Dive: Real-Time Session Analysis

Active Session History (ASH) samples active sessions roughly once per second. Each sample records information such as the SQL ID, session state, wait event, blocking session, and object being accessed.

ASH helps answer critical questions during performance incidents: which sessions are waiting, which SQL statements are responsible, and which objects are involved.

When to Use ASH vs AWR

Use AWR when analyzing performance over hours or days, finding trends and patterns, or comparing current performance to a historical baseline.

Use ASH when troubleshooting active performance issues in real time, identifying which sessions cause specific wait events, or finding contention on specific objects.

Real-Time Session Analysis

SQL — ASH: What’s Running Right Now
SELECT session_id,
       session_serial#,
       sql_id,
       event,
       wait_class,
       blocking_session
FROM   v$active_session_history
WHERE  sample_time > SYSDATE - INTERVAL '15' MINUTE
ORDER  BY sample_time DESC;

Advanced ASH Blocking Analysis

SQL — ASH: Identify Blocking Sessions
SELECT blocking_session,
       COUNT(*) AS waiting_sessions
FROM   v$active_session_history
WHERE  blocking_session IS NOT NULL
GROUP  BY blocking_session
ORDER  BY waiting_sessions DESC;

This query quickly identifies sessions responsible for widespread blocking across the database.

Figure 2: ASH Active Session Blocking Chain
SID 892 identified as the root blocking session holding a TX lock, causing 47 sessions to wait. ASH session sampling is particularly valuable for diagnosing real-time database contention.
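ASH shows who blocked whom historically; during a live incident the full blocking tree can also be walked from V$SESSION, which needs no Diagnostics Pack. A sketch using a hierarchical query rooted at sessions that block others but are themselves not blocked:

SQL — Live Blocking Tree from V$SESSION
SELECT LPAD(' ', 2 * (LEVEL - 1)) || s.sid AS blocking_tree,
       s.serial#,
       s.username,
       s.event
FROM   v$session s
START  WITH s.blocking_session IS NULL
       AND s.sid IN (SELECT blocking_session FROM v$session)
CONNECT BY PRIOR s.sid = s.blocking_session;

The indentation shows the chain depth: the root blocker appears flush left, with its waiters nested beneath it.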

5. SQL Monitor: Real-Time Query Execution Visibility

SQL Monitor provides real-time execution statistics for SQL statements. Unlike EXPLAIN PLAN, SQL Monitor displays actual row counts, operation-level wait events, execution progress, and parallel query distribution.

What Makes SQL Monitor Different

Feature                   | EXPLAIN PLAN   | SQL Monitor
--------------------------|----------------|-------------
Shows predicted plan      | Yes            | Yes
Shows actual execution    | No             | Yes
Actual row counts         | No (estimated) | Yes (actual)
Wait events per operation | No             | Yes
Real-time progress        | No             | Yes

Viewing Monitored SQL

SQL — List Monitored SQL Statements
SELECT sql_id,
       status,
       username,
       elapsed_time / 1000000 AS elapsed_seconds
FROM   v$sql_monitor
ORDER  BY sql_exec_start DESC;

Generate SQL Monitor Report

SQL — Generate SQL Monitor Report (HTML)
SELECT DBMS_SQLTUNE.REPORT_SQL_MONITOR(
         sql_id       => 'your_sql_id',
         type         => 'HTML',
         report_level => 'ALL'
       )
FROM   dual;

Interpreting SQL Monitor Output

SQL Monitor reports often reveal execution plan issues immediately. A cardinality mismatch — where the optimizer estimates 1,000 rows but 10 million are actually processed — can lead to incorrect join methods, excessive buffer gets, full table scans, and high CPU usage. Such discrepancies are a common root cause of performance degradation in OLTP environments.

Figure 3: SQL Monitor — Real-Time Execution Plan Analysis
Cardinality mismatch detected: optimizer estimated 1,000 rows but 10,842,201 rows were actually processed. Stale statistics caused a HASH JOIN with TABLE ACCESS FULL, resulting in 127 seconds of elapsed time.
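When stale statistics are confirmed as the cause, refreshing them is usually the first corrective step. A hedged sketch; the schema and table names (APP.ORDERS) are placeholders for your own objects:

SQL — Refresh Stale Statistics
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname       => 'APP',      -- placeholder schema
    tabname       => 'ORDERS',   -- placeholder table
    method_opt    => 'FOR ALL COLUMNS SIZE AUTO',
    cascade       => TRUE,       -- also gather index statistics
    no_invalidate => FALSE       -- invalidate dependent cursors so new plans are parsed
  );
END;
/

Setting no_invalidate to FALSE forces dependent cursors to reparse promptly, so the corrected estimates take effect without waiting for gradual cursor invalidation.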

6. Example Production Investigation

In one investigation, an AWR report showed log file sync contributing to nearly 40% of database wait time. ASH analysis revealed hundreds of sessions waiting on commit operations.

Further investigation identified an application batch process committing after every single row instead of batching transactions. After adjusting the commit strategy in the application layer, commit frequency dropped significantly and the database response time returned to normal levels.

This example illustrates how combining AWR historical analysis with ASH session sampling helps identify the true root cause of performance problems.
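In a case like this, ASH can quantify the commit storm and attribute it to the offending program or module before anyone touches the application. A sketch (requires Diagnostics Pack; the one-hour window is arbitrary):

SQL — Attribute log file sync Waits to Programs
SELECT program,
       module,
       COUNT(*) AS log_file_sync_samples
FROM   v$active_session_history
WHERE  event = 'log file sync'
AND    sample_time > SYSDATE - INTERVAL '1' HOUR
GROUP  BY program, module
ORDER  BY log_file_sync_samples DESC;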

7. Troubleshooting Scenario: CPU Bottleneck

Symptom: Users report application slowness. AWR report shows 80% of database time is CPU.

Common causes of high CPU:

  • Inefficient SQL execution plans — full table scans on large tables
  • Excessive hard parsing — no bind variables in application SQL
  • Application connection storms — too many sessions opening simultaneously
  • Expensive PL/SQL function calls in SQL WHERE clauses
SQL — Top SQL by CPU Time
SELECT sql_id,
       executions_delta,
       ROUND(cpu_time_delta / 1000000, 2) AS cpu_seconds
FROM   dba_hist_sqlstat
WHERE  snap_id BETWEEN :begin_snap AND :end_snap
ORDER  BY cpu_time_delta DESC
FETCH FIRST 20 ROWS ONLY;

Quick wins:

  • Add missing indexes to eliminate full table scans
  • Convert literals to bind variables to reduce hard parsing
  • Implement connection pooling to reduce session overhead
  • Rewrite SQL to avoid expensive functions in WHERE clauses
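Hard parsing caused by literal SQL is easy to confirm: cursors that differ only in their literals share a FORCE_MATCHING_SIGNATURE. A sketch that surfaces the worst offenders (the threshold of 10 near-duplicates is arbitrary):

SQL — Detect Literal SQL via FORCE_MATCHING_SIGNATURE
SELECT force_matching_signature,
       COUNT(*)      AS near_duplicate_cursors,
       MIN(sql_text) AS sample_sql
FROM   v$sql
WHERE  force_matching_signature > 0
GROUP  BY force_matching_signature
HAVING COUNT(*) > 10
ORDER  BY near_duplicate_cursors DESC
FETCH FIRST 10 ROWS ONLY;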

8. Troubleshooting Scenario: Lock Contention

Symptom: Users report "hanging" transactions. AWR shows high enq: TX - row lock contention wait time.

SQL — Current Lock Contention
-- Currently blocked sessions
SELECT blocking_session, sid, serial#, username, seconds_in_wait
FROM   v$session
WHERE  blocking_session IS NOT NULL;

-- Historical lock analysis with ASH (last 2 hours)
-- DBA_HIST_ACTIVE_SESS_HISTORY keeps one sample every ~10 seconds,
-- so each sample approximates 10 seconds of wait time
SELECT sql_id,
       blocking_session,
       COUNT(*)                AS lock_samples,
       ROUND(COUNT(*) * 10, 2) AS approx_wait_seconds,
       MIN(sample_time)        AS first_seen,
       MAX(sample_time)        AS last_seen
FROM   dba_hist_active_sess_history
WHERE  sample_time > SYSDATE - INTERVAL '2' HOUR
AND    event LIKE '%enq: TX%'
AND    blocking_session IS NOT NULL
GROUP  BY sql_id, blocking_session
ORDER  BY lock_samples DESC;

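Once the root blocker is identified, and the application owner confirms its transaction can be sacrificed, the session can be terminated. The SID and serial number below are placeholders; always take them from your own V$SESSION output:

SQL — Terminate the Root Blocking Session
-- Values are placeholders: substitute the actual sid,serial# of the blocker
ALTER SYSTEM KILL SESSION '892,12345' IMMEDIATE;

Killing a session rolls back its open transaction, which can itself take time for long-running transactions, so prefer having the application commit or roll back gracefully when possible.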
9. DBA Checklist During Performance Incidents

Database Layer

  • Check active session count
  • Identify dominant wait events via AWR Top Timed Events
  • Review top SQL by elapsed time

SQL Layer

  • Examine execution plans with SQL Monitor
  • Verify statistics freshness
  • Check for unexpected full table scans

Infrastructure Layer

  • Monitor OS CPU usage from AWR OS Stats
  • Check I/O latency per datafile
  • Validate storage performance

Application Layer

  • Connection pool saturation
  • Inefficient or unparameterized queries
  • Sudden workload spikes or batch jobs

10. Key Takeaways

Effective Oracle performance troubleshooting requires a structured diagnostic approach.

  • AWR: use it to understand workload patterns and trends over time.
  • ASH: use it to analyze session-level activity and identify blocking chains.
  • SQL Monitor: use it to investigate SQL execution behavior in real time.
  • Used together, these tools allow DBAs to resolve performance incidents efficiently.
  • Always verify Diagnostics Pack licensing before using AWR/ASH in production.

About the Author

Chetan Yadav

Chetan Yadav is a Senior Oracle, PostgreSQL, MySQL, and Cloud DBA with 15+ years of hands-on experience managing production databases across on-premises, hybrid, and cloud environments.

His expertise includes Oracle performance tuning, high availability architecture, database migrations, disaster recovery, and cloud database platforms. He regularly shares practical DBA insights based on real production troubleshooting experience.

This blog focuses on real-world DBA problems, career growth, and practical learning — not theoretical documentation or vendor marketing.

Monday, February 16, 2026

Automating Backup and Restore in Oracle 19c and 23ai: Complete DBA Guide

Production-Tested RMAN Automation Scripts and Recovery Strategies
📅 February 05, 2026
👤 Chetan Yadav - Senior Oracle & Cloud DBA
⏱️ 18–20 min read
💾 Oracle RMAN Automation - From Manual Backups to Fully Automated Recovery

At 3 AM on a Tuesday, our production Oracle 19c database crashed. Corrupted datafile. The application team was screaming. The CTO was on the call. Everyone looked at me.

I typed one command: ./restore_prod.sh PRODDB 2026-02-04_23:00. Twenty-three minutes later, the database was back online with zero data loss. The automated backup and restore framework I'd built six months earlier just saved our jobs.


Manual backup and restore processes are where Oracle DBAs lose the most time. Automating RMAN backups isn't about convenience—it's about reliability, consistency, and being able to restore in minutes instead of hours when production is down.

This guide covers production-tested automation frameworks for Oracle 19c and 23ai. If you're still running manual RMAN scripts or struggling with backup consistency, these patterns will save you hours every week and make disaster recovery predictable.
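As a taste of the pattern, a minimal nightly backup wrapper might look like the sketch below. The paths, log naming, and RMAN commands are illustrative placeholders, not the exact framework from the incident above; adapt them to your retention policy and catalog setup:

Shell — Minimal RMAN Backup Wrapper (sketch)
#!/bin/bash
# Minimal RMAN backup wrapper: nightly full backup plus archivelogs.
# Paths and SID handling are placeholders; harden for your environment.
set -euo pipefail

export ORACLE_SID="${1:?usage: backup_db.sh <ORACLE_SID>}"
STAMP=$(date +%Y-%m-%d_%H%M)
LOG="/backup/logs/${ORACLE_SID}_${STAMP}.log"

rman target / log="$LOG" <<'EOF'
RUN {
  BACKUP AS COMPRESSED BACKUPSET DATABASE PLUS ARCHIVELOG DELETE INPUT;
  DELETE NOPROMPT OBSOLETE;
}
EOF

# set -e aborts the script on a non-zero rman exit; hook alerting here
echo "Backup of ${ORACLE_SID} completed at $(date)" >> "$LOG"

Scheduled from cron with the log file fed into your alerting, this shape is the seed the full automation framework grows from: timestamped logs, fail-fast error handling, and idempotent retention cleanup.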

Friday, January 23, 2026

Oracle Database 23ai: Revolutionizing Data Distribution Across the Globe

A Journey Through Distributed Database Innovation with François Pons

📅 January 23, 2026 👤 Chetan Yadav - Oracle ACE Apprentice ⏱️ 10-15 min read

🌍 Oracle Globally Distributed Database - Global Scale, Local Performance

🎯 My Journey as an Oracle ACE Apprentice: Uncovering Database Innovation

When I first received my acceptance into the Oracle ACE Apprentice program, I knew I'd be diving deep into Oracle technologies. One of my initial tasks was to review and showcase product releases through demonstrations and write-ups. I chose to explore Oracle Database 23ai's Globally Distributed Database feature, and what I discovered genuinely surprised me.

This wasn't just another database update—this was a complete reimagining of how we think about data distribution, scalability, and geographic compliance. The presentation by François Pons, Senior Principal Product Manager at Oracle, opened my eyes to capabilities I didn't even know were possible in enterprise databases.

💡 Why This Matters: As part of my Oracle ACE Apprentice journey, I'm required to demonstrate Oracle product usage by submitting three demonstrations within the first 60 days. This deep dive into globally distributed databases represents one of those demonstrations, and it turned out to be far more inspiring than I initially expected.

🎤 What Makes This Presentation Stand Out

François Pons doesn't just walk through technical specifications; he tells a story about solving real business problems. From the moment he begins explaining distributed databases, you realize this technology addresses challenges that keep CTOs awake at night: how to scale infinitely, how to survive disasters, and how to comply with data sovereignty laws across multiple countries.

What struck me most was the elegance of the solution. Oracle hasn't just bolted on distributed capabilities to their existing database—they've fundamentally rethought how data can be spread across the globe while maintaining the full power of SQL and ACID transactions.

"All the benefits of a distributed database, without the compromises. Why settle for less?" - François Pons

Basic Distributed Database Architecture: Application connects to multiple shards

🧩 Understanding Distributed Databases: Breaking It Down

Let me share what I learned from this presentation in a way that makes sense, even if you're new to distributed database concepts.

The Core Concept

A distributed database stores data across multiple physical locations instead of keeping everything in one place. Think of it like having multiple bank branches instead of one central vault. Each location (called a "shard") stores a subset of your data, but applications interact with it as if it were a single, unified database.

The beauty? Your applications don't need to know where the data physically resides. Oracle handles all the complexity behind the scenes.

Why This Matters in 2026

François highlighted two primary use cases that resonated with me:

1️⃣ Ultimate Scalability and Survivability

When your application grows beyond what a single database can handle—even a powerful clustered database—distributed architecture becomes essential. Oracle's approach lets you scale horizontally by adding more shards, each potentially running on commodity hardware or in different cloud providers.

2️⃣ Data Sovereignty Compliance

With regulations like GDPR in Europe, data localization laws in China, and similar requirements worldwide, companies need to ensure specific data stays in specific geographic regions. Oracle's value-based sharding makes this straightforward: European customer data stays on European servers, American data stays in America, and so on.


Value-Based Sharding: Data distributed by geography for sovereignty compliance

🚀 The Technical Innovations That Impressed Me

Multiple Data Distribution Methods

Oracle doesn't force you into a one-size-fits-all approach. François explains four different distribution strategies:

  • Value-Based Sharding: Distribute data by specific values like country or product category. Perfect for data sovereignty requirements where you need to guarantee data residency.
  • System-Managed (Hash-Based) Sharding: Uses consistent hashing to evenly distribute data across shards. Ideal when you need balanced performance and don't have geographic constraints.
  • Composite Sharding: Combines value-based and hash-based approaches. For example, first distribute by country, then within each country distribute evenly across multiple shards by customer ID.
  • Duplicated Tables: Small, read-mostly reference tables can be duplicated across all shards to avoid cross-shard queries.
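To make these strategies concrete, here is a rough sketch of the DDL shapes involved. All table, column, and tablespace names are invented for illustration, and exact clauses vary by release, so treat this as a shape to check against the Oracle sharding documentation rather than copy-paste DDL:

SQL — Sharded Table DDL Shapes (sketch)
-- Value-based (user-defined) sharding: rows placed by country code
CREATE SHARDED TABLE customers (
  cust_id       NUMBER       NOT NULL,
  country_code  VARCHAR2(2)  NOT NULL,
  name          VARCHAR2(100),
  CONSTRAINT customers_pk PRIMARY KEY (cust_id, country_code)
)
PARTITION BY LIST (country_code) (
  PARTITION p_eu VALUES ('DE', 'FR', 'IT') TABLESPACE ts_eu,
  PARTITION p_us VALUES ('US', 'CA')       TABLESPACE ts_us
);

-- System-managed sharding: consistent hash spreads rows evenly
CREATE SHARDED TABLE orders (
  order_id NUMBER NOT NULL,
  cust_id  NUMBER NOT NULL,
  CONSTRAINT orders_pk PRIMARY KEY (order_id)
)
PARTITION BY CONSISTENT HASH (order_id)
PARTITIONS AUTO
TABLESPACE SET ts_orders;

-- Small, read-mostly reference data copied to every shard
CREATE DUPLICATED TABLE countries (
  country_code VARCHAR2(2) PRIMARY KEY,
  country_name VARCHAR2(60)
);

The sharding key appears in the primary key of each sharded table, which is what lets Oracle route a query to the single shard that owns the row.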

Replication Strategies: Where Innovation Shines

🆕 Raft-Based Replication (New in 23ai)

This is the game-changer François seemed most excited about. Based on the popular Raft consensus protocol, it provides:

  • Automatic failover in under 3 seconds
  • Zero data loss through synchronous replication
  • Active-active symmetric configuration where each shard accepts both reads and writes
  • No need to configure Data Guard or GoldenGate separately

⚡ Performance Note: The Raft implementation particularly impressed me because it addresses a common distributed database challenge: achieving both high availability and data consistency without complex manual configuration.

🌐 Deployment Flexibility: Oracle Meets You Where You Are

One aspect François emphasized that I found particularly practical: Oracle doesn't dictate your infrastructure choices. You can deploy shards:

  • On independent commodity servers (simple, low-cost)
  • On fault-tolerant RAC clusters (combining distributed and clustered architectures)
  • Across multiple clouds (OCI, AWS, Azure)
  • In hybrid on-premises and cloud configurations

💼 Real-World Use Cases

François showcased several application types already using Oracle Globally Distributed Database:

  • 📱 Mobile messaging platforms: Require massive scale and low latency worldwide
  • 💳 Payment processing: Needs transaction consistency and regulatory compliance
  • 🔍 Credit card fraud detection: Demands real-time processing across regions
  • 🌐 IoT applications: Like smart power meters generating enormous data volumes
  • 🖥️ Internet infrastructure: Supporting critical distributed services

🤖 The Autonomous Advantage

While François covered the core distributed database technology, he also highlighted Oracle Globally Distributed Autonomous Database, which adds automated management to eliminate operational complexity.

🎬 What the Demo Revealed

The live demonstration François provided showed just how straightforward the setup process has become. Using the Oracle Cloud interface, he displayed a map-based configuration where you simply click regions to place shards.

💡 My Key Takeaways as an ACE Apprentice

Key Insights

  • Oracle is solving real business problems, not just adding features. Every capability François described addresses actual challenges companies face when scaling globally.
  • The convergence of distributed and clustered architectures is powerful. You don't have to choose between RAC's local performance and sharding's global scale—you can have both.
  • Raft replication represents a significant step forward. Three-second automatic failover with zero data loss is exactly what distributed applications need.

🔮 Looking Forward: The Broader Implications

Multi-cloud becomes practical

When you can seamlessly deploy across OCI, AWS, and Azure in a single distributed database, you're no longer locked into one vendor's ecosystem.

Global applications become easier

Developers can focus on application logic rather than data distribution complexity.

📚 Resources and Next Steps

If you're interested in exploring Oracle Database 23ai's Globally Distributed Database further, I recommend:

  1. Watch François Pons's complete presentation on the Oracle Developers YouTube channel
  2. Visit oracle.com/database/distributed-database for comprehensive documentation
  3. Try the free tier on Oracle Cloud to experiment hands-on
  4. Review the Oracle 23ai documentation on Raft replication

📢 Found this helpful? Share it!

#OracleDatabase #Oracle23ai #DistributedDatabases #OracleACE #CloudDatabases #RaftReplication

About the Author

CY

Chetan Yadav

Oracle ACE Apprentice | Senior Oracle & Cloud DBA

This blog post was created as part of my Oracle ACE Apprentice journey, where I'm exploring and demonstrating Oracle product innovations. The insights shared here come from my review of François Pons's excellent presentation on Oracle Database 23ai's Globally Distributed Database capabilities.

Connect & Learn More:
📊 LinkedIn Profile | 🎥 YouTube Channel