Oracle Performance Engineering Guide: AWR, ASH and SQL Monitor in Oracle 19c and 23ai
The troubleshooting approach described in this article reflects common production environments such as:
Oracle Database 19c • 2-Node RAC Cluster • 8 TB OLTP Database • ~2,500 Concurrent Sessions • Peak ~1,500 TPS
While the SQL queries shown are simplified for educational purposes, the investigation methodology mirrors real production troubleshooting practices used by experienced DBAs.
During a recent performance investigation, an OLTP database began experiencing significant latency during peak traffic. Average response time increased from roughly 300 milliseconds to more than 5 seconds.
Application teams initially suspected CPU saturation. Infrastructure teams suspected storage latency. Situations like this are common during performance incidents where multiple teams wait for the DBA team to identify the root cause.
The investigation followed a common Oracle DBA diagnostic workflow: identify workload patterns using AWR, analyze session-level activity using ASH, and investigate problematic SQL execution using SQL Monitor. These three tools form the core performance troubleshooting toolkit used by Oracle DBAs in production environments.
- The Oracle Performance Troubleshooting Trinity
- AWR Fundamentals: What It Measures and Why It Matters
- Reading AWR Reports: Focus on These Sections First
- ASH Deep Dive: Real-Time Session Analysis
- SQL Monitor: Real-Time Query Execution Visibility
- Example Production Investigation
- Troubleshooting Scenario: CPU Bottleneck
- Troubleshooting Scenario: Lock Contention
- DBA Checklist During Performance Incidents
- Key Takeaways
1. The Oracle Performance Troubleshooting Trinity
| Tool | Purpose |
|---|---|
| AWR | Historical workload analysis |
| ASH | Session-level activity sampling |
| SQL Monitor | Real-time SQL execution visibility |
Using these tools together allows DBAs to move efficiently from symptom to root cause.
Oracle Performance Troubleshooting Workflow
- Identify workload changes — Review AWR load profile and wait events to determine whether the issue is caused by increased workload or resource contention.
- Identify dominant wait events — Examine the Top Timed Events section in the AWR report.
- Identify problematic SQL — Review SQL ordered by elapsed time to identify expensive queries.
- Investigate session behavior — Use ASH to identify blocking sessions, hot objects, or concurrency problems.
- Analyze execution plans — Use SQL Monitor or execution plan analysis to understand query behavior.
- Implement corrective action — Possible fixes include index creation, SQL tuning, statistics refresh, or application-level adjustments.
2. AWR Fundamentals: What It Measures and Why It Matters
The Automatic Workload Repository (AWR) stores historical database performance metrics captured at regular intervals. By default Oracle creates snapshots every 60 minutes, storing metrics such as wait events, SQL execution statistics, system statistics, memory utilization, and I/O activity. These snapshots allow DBAs to analyze workload trends over time.
What AWR Captures
| Metric Category | Description | Why It Matters |
|---|---|---|
| Wait Events | Time spent waiting for resources | Identifies performance bottlenecks |
| Top SQL | SQL statements consuming most resources | Prioritize tuning efforts |
| System Statistics | Database-level metrics | Understand workload behavior |
| Memory Usage | SGA and PGA statistics | Detect memory pressure |
| I/O Statistics | Datafile read and write activity | Identify storage issues |
Generating an AWR Report
Adjusting Snapshot Retention
AWR, ASH, and SQL Monitor are part of the Oracle Diagnostics Pack. Using these features requires a Diagnostics Pack license: AWR reports, ASH queries, DBA_HIST views, SQL Monitor reports, and ADDM analysis. If Diagnostics Pack is not licensed, DBAs should instead rely on Statspack, V$SESSION, V$SYSTEM_EVENT, or V$SQL. Always verify licensing compliance before using these features in production environments.
3. Reading AWR Reports: Focus on These Sections First
Although AWR reports can exceed 80 pages, the most important sections include: Top Timed Foreground Events, SQL Ordered by Elapsed Time, Load Profile, and Instance Efficiency Metrics.
Top Timed Foreground Events
This is the most important section. It shows where your database spent the most time. High values for db file sequential read often indicate index-driven workloads, inefficient SQL access patterns, storage latency, or missing indexes.
Identifying Expensive SQL
4. ASH Deep Dive: Real-Time Session Analysis
Active Session History (ASH) samples active sessions roughly once per second. Each sample records information such as the SQL ID, session state, wait event, blocking session, and object being accessed.
ASH helps answer critical questions during performance incidents: which sessions are waiting, which SQL statements are responsible, and which objects are involved.
When to Use ASH vs AWR
Use AWR when analyzing performance over hours or days, finding trends and patterns, or comparing current performance to a historical baseline.
Use ASH when troubleshooting active performance issues in real time, identifying which sessions cause specific wait events, or finding contention on specific objects.
Real-Time Session Analysis
Advanced ASH Blocking Analysis
This query quickly identifies sessions responsible for widespread blocking across the database.
5. SQL Monitor: Real-Time Query Execution Visibility
SQL Monitor provides real-time execution statistics for SQL statements. Unlike EXPLAIN PLAN, SQL Monitor displays actual row counts, operation-level wait events, execution progress, and parallel query distribution.
What Makes SQL Monitor Different
| Feature | EXPLAIN PLAN | SQL Monitor |
|---|---|---|
| Shows predicted plan | Yes | Yes |
| Shows actual execution | No | Yes |
| Actual row counts | No (estimated) | Yes (actual) |
| Wait events per operation | No | Yes |
| Real-time progress | No | Yes |
Viewing Monitored SQL
Generate SQL Monitor Report
Interpreting SQL Monitor Output
SQL Monitor reports often reveal execution plan issues immediately. A cardinality mismatch — where the optimizer estimates 1,000 rows but 10 million are actually processed — can lead to incorrect join methods, excessive buffer gets, full table scans, and high CPU usage. Such discrepancies are a common root cause of performance degradation in OLTP environments.
6. Example Production Investigation
In one investigation, an AWR report showed log file sync contributing to nearly 40% of database wait time. ASH analysis revealed hundreds of sessions waiting on commit operations.
Further investigation identified an application batch process committing after every single row instead of batching transactions. After adjusting the commit strategy in the application layer, commit frequency dropped significantly and the database response time returned to normal levels.
This example illustrates how combining AWR historical analysis with ASH session sampling helps identify the true root cause of performance problems.
7. Troubleshooting Scenario: CPU Bottleneck
Symptom: Users report application slowness. AWR report shows 80% of database time is CPU.
Common causes of high CPU:
- Issue Inefficient SQL execution plans — full table scans on large tables
- Issue Excessive hard parsing — no bind variables in application SQL
- Issue Application connection storms — too many sessions opening simultaneously
- Issue Expensive PL/SQL function calls in SQL WHERE clauses
Quick wins:
- Fix Add missing indexes to eliminate full table scans
- Fix Convert literals to bind variables to reduce hard parsing
- Fix Implement connection pooling to reduce session overhead
- Fix Rewrite SQL to avoid expensive functions in WHERE clauses
8. Troubleshooting Scenario: Lock Contention
Symptom: Users report "hanging" transactions. AWR shows high enq: TX - row lock contention wait time.
9. DBA Checklist During Performance Incidents
Database Layer
- Check active session count
- Identify dominant wait events via AWR Top Timed Events
- Review top SQL by elapsed time
SQL Layer
- Examine execution plans with SQL Monitor
- Verify statistics freshness
- Check for unexpected full table scans
Infrastructure Layer
- Monitor OS CPU usage from AWR OS Stats
- Check I/O latency per datafile
- Validate storage performance
Application Layer
- Connection pool saturation
- Inefficient or unparameterized queries
- Sudden workload spikes or batch jobs
10. Key Takeaways
Effective Oracle performance troubleshooting requires a structured diagnostic approach.
- AWR Use to understand workload patterns and trends over time.
- ASH Use to analyze session-level activity and identify blocking chains.
- SQL Monitor Use to investigate SQL execution behavior in real time.
- Combined Combining these tools allows DBAs to resolve performance incidents efficiently.
- License Always verify Diagnostics Pack licensing before using AWR/ASH in production.