Monday, March 2, 2026

Oracle Performance Engineering Guide: AWR, ASH and SQL Monitor in 19c and 23ai

Real Production Troubleshooting with Oracle's Performance Diagnostic Tools
📅 March 02, 2026
👤 Chetan Yadav - Senior Oracle & Cloud DBA
⏱️ 12-14 min read
📊 Master AWR, ASH, and SQL Monitor - The Oracle DBA's Performance Troubleshooting Trinity

Production was slow. Users were complaining. The application team blamed the database. I had 30 minutes to find the root cause before the CTO joined the war room call.

I opened an AWR report. Buffer busy waits: 87% of total wait time. But which sessions? Which objects? AWR doesn't tell you. That's when I switched to ASH. Three sessions hammering the same table block. SQL Monitor showed me the exact query execution plan with real-time wait events. Problem identified: missing index on a foreign key. Index created. Performance restored. Crisis averted.

[Image: database performance monitoring dashboard illustrating AWR, ASH, and SQL Monitor diagnostics]

AWR, ASH, and SQL Monitor are Oracle's three essential performance diagnostic tools. AWR gives you the big picture over time. ASH shows you what's happening right now at the session level. SQL Monitor reveals how individual queries execute in real-time. Together, they form the complete performance troubleshooting toolkit that every Oracle DBA needs to master.

This guide covers real production troubleshooting techniques using AWR, ASH, and SQL Monitor in Oracle 19c and 23ai. If you're tired of guessing at performance problems or spending hours digging through logs, these are the diagnostic queries and analysis patterns that actually work.

1. AWR Fundamentals: What It Measures and Why It Matters

Automatic Workload Repository (AWR) is Oracle's time-series performance data warehouse. Every hour (by default), Oracle captures a snapshot of database performance metrics.

What AWR Captures

Metric Category    | What It Measures                        | Why It Matters
-------------------|------------------------------------------|----------------------------------------
Wait Events        | Where the database spent time waiting    | Identifies bottlenecks (I/O, CPU, locks)
Top SQL            | Most resource-intensive queries          | Find queries to optimize first
System Statistics  | DB-wide metrics (commits, reads, writes) | Understand workload characteristics
Memory Usage       | SGA, PGA, buffer cache hit ratios        | Memory tuning opportunities
I/O Statistics     | Datafile reads/writes, latency           | Storage performance issues

AWR vs Statspack

AWR replaced Statspack in Oracle 10g. Key differences:

  • AWR: Requires Diagnostics Pack license (paid), automatic snapshots, more detailed data
  • Statspack: Free, manual snapshots, less detail but still useful
  • Common mistake: Running AWR reports without proper licensing (Oracle audits this)
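If the Diagnostics Pack isn't licensed, Statspack is the free fallback mentioned above. A minimal sketch of the standard workflow, using the scripts Oracle ships under ?/rdbms/admin (run as a privileged user):

```sql
-- One-time install (creates the PERFSTAT schema; run as SYSDBA)
@?/rdbms/admin/spcreate.sql

-- Take a manual snapshot (schedule these, e.g. hourly, via DBMS_SCHEDULER)
EXEC statspack.snap;

-- Generate a report between two snapshot IDs (prompts interactively)
@?/rdbms/admin/spreport.sql
```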
SQL - Generate AWR Report
-- Connect as SYS or a user with DBA privileges
@?/rdbms/admin/awrrpt.sql

-- Or generate programmatically
SELECT output
FROM   TABLE(DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_HTML(
         l_dbid     => (SELECT dbid FROM v$database),
         l_inst_num => 1,
         l_bid      => 12345,   -- Begin snapshot ID
         l_eid      => 12350    -- End snapshot ID
       ));

-- Find recent snapshot IDs
SELECT snap_id,
       TO_CHAR(begin_interval_time, 'YYYY-MM-DD HH24:MI') AS snap_time,
       TO_CHAR(end_interval_time, 'YYYY-MM-DD HH24:MI')   AS end_time
FROM   dba_hist_snapshot
WHERE  begin_interval_time > SYSDATE - 2
ORDER  BY snap_id DESC;

AWR Configuration Best Practices

SQL - Configure AWR Snapshot Settings
-- Check current AWR settings
SELECT snap_interval, retention
FROM   dba_hist_wr_control;

-- Modify AWR snapshot frequency and retention
BEGIN
  DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(
    retention => 43200,  -- 30 days (in minutes)
    interval  => 30      -- Snapshot every 30 minutes
  );
END;
/

-- Create a manual snapshot (useful during performance issues)
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;

2. Reading AWR Reports: Focus on These Sections First

AWR reports are 50-100 pages long. Here's what actually matters when troubleshooting.

Section 1: Top 5 Timed Foreground Events

This is the most important section. It shows where your database spent the most time.

Example analysis:

Event                         | Wait Time (sec) | % of Total | What It Means
------------------------------|-----------------|------------|------------------------------
db file sequential read       | 45,230          | 62%        | Single-block reads (index lookups)
CPU time                      | 18,450          | 25%        | Actual CPU processing
log file sync                 | 5,890           | 8%         | Commits waiting for redo writes
db file scattered read        | 2,120           | 3%         | Multi-block reads (full scans)
enq: TX - row lock contention | 1,450           | 2%         | Row-level lock contention

Analysis: This system is I/O bound (62% in db file sequential read). Focus on:

  • Which SQL statements cause the most single-block reads
  • Inefficient access paths (for example, nested-loop joins doing row-by-row index lookups where a hash join with a full scan would be cheaper)
  • Storage latency issues
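ASH history can bridge the gap AWR leaves here by tying the dominant wait event back to specific statements. A sketch, assuming Diagnostics Pack licensing; the snapshot IDs are placeholders:

```sql
-- Top SQL by 'db file sequential read' samples in an AWR window
SELECT sql_id,
       COUNT(*)      AS samples,
       COUNT(*) * 10 AS approx_seconds  -- DBA_HIST ASH keeps one sample per 10 seconds
FROM   dba_hist_active_sess_history
WHERE  snap_id BETWEEN 12345 AND 12350
AND    event = 'db file sequential read'
GROUP  BY sql_id
ORDER  BY samples DESC
FETCH FIRST 10 ROWS ONLY;
```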

Section 2: SQL Ordered by Elapsed Time

Shows the most expensive queries by total elapsed time. These are your tuning targets.

SQL - Extract Top SQL from AWR
-- Top SQL by elapsed time (from AWR snapshots)
SELECT sql_id,
       plan_hash_value,
       executions_delta                     AS executions,
       ROUND(elapsed_time_delta/1000000, 2) AS elapsed_sec,
       ROUND(cpu_time_delta/1000000, 2)     AS cpu_sec,
       ROUND(iowait_delta/1000000, 2)       AS io_wait_sec,
       buffer_gets_delta                    AS buffer_gets,
       disk_reads_delta                     AS disk_reads
FROM   dba_hist_sqlstat
WHERE  snap_id BETWEEN 12345 AND 12350
AND    executions_delta > 0
ORDER  BY elapsed_time_delta DESC
FETCH FIRST 20 ROWS ONLY;

Section 3: Load Profile

Gives you per-second and per-transaction metrics. Compare against baseline to spot anomalies.

Key metrics to watch:

  • Logical Reads/sec: Buffer cache activity
  • Physical Reads/sec: Disk I/O activity
  • Executions/sec: Query throughput
  • Transactions/sec: Business transaction rate
  • Commits/sec: Transaction commit rate
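If you want these rates without generating a full report, they can be approximated from DBA_HIST_SYSSTAT deltas. A sketch, assuming no instance restart between the two snapshots (the snapshot IDs are placeholders; the statistic names are the standard V$SYSSTAT ones):

```sql
-- Approximate per-second load profile between two snapshots
WITH stats AS (
  SELECT stat_name, MAX(value) - MIN(value) AS delta
  FROM   dba_hist_sysstat
  WHERE  snap_id IN (12345, 12350)
  AND    stat_name IN ('session logical reads', 'physical reads',
                       'execute count', 'user commits')
  GROUP  BY stat_name
),
elapsed AS (
  SELECT (CAST(MAX(end_interval_time) AS DATE)
        - CAST(MIN(end_interval_time) AS DATE)) * 86400 AS secs
  FROM   dba_hist_snapshot
  WHERE  snap_id IN (12345, 12350)
)
SELECT s.stat_name,
       ROUND(s.delta / e.secs, 2) AS per_second
FROM   stats s CROSS JOIN elapsed e;
```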

3. ASH Deep Dive: Real-Time Session Analysis

Active Session History (ASH) samples every active session once per second into an in-memory buffer; every tenth sample is persisted to disk in DBA_HIST_ACTIVE_SESS_HISTORY. Unlike AWR, which gives you hourly summaries, ASH shows you exactly what each session was doing at specific points in time.

When to Use ASH vs AWR

Use AWR when:

  • Analyzing performance over hours or days
  • Finding trends and patterns
  • Comparing current performance to historical baseline

Use ASH when:

  • Troubleshooting active performance issues (right now)
  • Identifying which sessions cause specific wait events
  • Finding contention on specific objects or SQL statements
SQL - ASH: What's Running Right Now
-- Real-time view of active sessions
SELECT session_id,
       session_serial#,
       user_id,
       sql_id,
       sql_plan_hash_value,
       event,
       wait_class,
       time_waited/1000000 AS wait_time_sec,  -- TIME_WAITED is in microseconds
       blocking_session,
       current_obj#,
       program,
       module
FROM   v$active_session_history
WHERE  sample_time > SYSDATE - INTERVAL '15' MINUTE
AND    session_state = 'WAITING'
ORDER  BY sample_time DESC;
SQL - ASH: Top Wait Events (Last Hour)
-- What sessions spent the most time waiting for (last hour)
SELECT event,
       wait_class,
       COUNT(*) AS sample_count,
       ROUND(COUNT(*) / SUM(COUNT(*)) OVER () * 100, 2) AS pct_of_samples,
       COUNT(*) AS approx_seconds  -- V$ASH samples once per second
FROM   v$active_session_history
WHERE  sample_time > SYSDATE - INTERVAL '60' MINUTE
AND    event IS NOT NULL
GROUP  BY event, wait_class
ORDER  BY sample_count DESC
FETCH FIRST 20 ROWS ONLY;

ASH: Finding Blocking Sessions

SQL - ASH: Identify Blocking Sessions
-- Find sessions causing locks
SELECT blocking_session,
       blocking_session_serial#,
       COUNT(DISTINCT session_id) AS blocked_sessions,
       COUNT(*)                   AS total_blocking_samples,
       LISTAGG(DISTINCT event, ', ') WITHIN GROUP (ORDER BY event) AS wait_events
FROM   v$active_session_history
WHERE  blocking_session IS NOT NULL
AND    sample_time > SYSDATE - INTERVAL '30' MINUTE
GROUP  BY blocking_session, blocking_session_serial#
ORDER  BY total_blocking_samples DESC;

4. SQL Monitor: Real-Time Query Execution Visibility

SQL Monitor tracks long-running or resource-intensive queries in real-time. It shows you the execution plan with actual row counts, timing, and wait events for each plan operation.

What Makes SQL Monitor Different

Feature                   | EXPLAIN PLAN       | SQL Monitor
--------------------------|--------------------|------------
Shows predicted plan      | Yes                | Yes
Shows actual execution    | No                 | Yes
Actual row counts         | No (estimates only)| Yes
Wait events per operation | No                 | Yes
Real-time progress        | No                 | Yes
SQL - Enable and View SQL Monitor
-- SQL Monitor activates automatically for statements running longer than
-- 5 seconds or executing in parallel (requires the Tuning Pack license).
-- Collect extended plan statistics for the session:
ALTER SESSION SET STATISTICS_LEVEL = ALL;

-- Or force monitoring for a specific query with the MONITOR hint
SELECT /*+ MONITOR */ employee_id, first_name, last_name
FROM   employees
WHERE  department_id = 50;

-- List monitored SQL statements
SELECT sql_id,
       status,
       username,
       TO_CHAR(sql_exec_start, 'YYYY-MM-DD HH24:MI:SS') AS exec_start,
       elapsed_time/1000000 AS elapsed_sec,
       cpu_time/1000000     AS cpu_sec,
       buffer_gets,
       disk_reads
FROM   v$sql_monitor
WHERE  sql_exec_start > SYSDATE - 1
ORDER  BY sql_exec_start DESC;

-- Generate a SQL Monitor report (HTML format)
SELECT DBMS_SQLTUNE.REPORT_SQL_MONITOR(
         sql_id       => 'a1b2c3d4e5f6g',  -- replace with your SQL_ID
         type         => 'HTML',
         report_level => 'ALL'
       ) AS report
FROM   DUAL;

Reading SQL Monitor Reports

Key sections to analyze:

  1. Plan Statistics: Compare estimated vs. actual rows (large cardinality misestimates usually point to stale statistics or correlated predicates)
  2. Activity Breakdown: Which plan operations consumed the most time
  3. Wait Events: What each operation waited for (I/O, CPU, locks)
  4. Parallel Execution: Whether work is evenly balanced across parallel execution servers
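The estimated-vs-actual comparison from step 1 is also available without the HTML report. A sketch against V$SQL_PLAN_STATISTICS_ALL; the SQL_ID is a placeholder, and actual row counts are only populated with STATISTICS_LEVEL = ALL or the GATHER_PLAN_STATISTICS hint:

```sql
-- Estimated vs actual rows for each plan operation
SELECT id,
       operation,
       object_name,
       cardinality      AS est_rows,    -- optimizer estimate
       last_output_rows AS actual_rows  -- from the last execution
FROM   v$sql_plan_statistics_all
WHERE  sql_id = 'a1b2c3d4e5f6g'
ORDER  BY id;

-- Or let DBMS_XPLAN format the same comparison
SELECT * FROM TABLE(
  DBMS_XPLAN.DISPLAY_CURSOR('a1b2c3d4e5f6g', NULL, 'ALLSTATS LAST'));
```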

5. Troubleshooting Scenario #1: CPU Bottleneck

Symptom: Users report application slowness. AWR report shows 80% of database time is CPU.

Step 1: Confirm CPU Saturation

SQL - Check CPU Usage
-- OS CPU usage from AWR
SELECT stat_name, value
FROM   dba_hist_osstat
WHERE  snap_id = (SELECT MAX(snap_id) FROM dba_hist_snapshot)
AND    stat_name IN ('NUM_CPUS', 'LOAD', 'BUSY_TIME', 'IDLE_TIME');

-- Top SQL by CPU time
SELECT sql_id,
       plan_hash_value,
       executions_delta AS executions,
       ROUND(cpu_time_delta/1000000, 2) AS cpu_sec,
       ROUND((cpu_time_delta/1000000)/executions_delta, 4) AS cpu_per_exec
FROM   dba_hist_sqlstat
WHERE  snap_id BETWEEN 12345 AND 12350
AND    executions_delta > 0
ORDER  BY cpu_time_delta DESC
FETCH FIRST 20 ROWS ONLY;

Step 2: Identify CPU-Intensive SQL

Common causes of high CPU:

  • Inefficient SQL: Full table scans on large tables
  • Parse storms: Excessive hard parsing (no bind variables)
  • Excessive function calls: PL/SQL functions in SQL WHERE clauses
  • Too many connections: Each session consumes CPU
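Parse storms in particular are easy to confirm: statements that differ only in literal values share a FORCE_MATCHING_SIGNATURE in V$SQLAREA. A sketch (the threshold of 20 similar statements is arbitrary):

```sql
-- Find literal-heavy SQL that should be using bind variables
SELECT force_matching_signature,
       COUNT(*)        AS similar_statements,
       MIN(sql_id)     AS sample_sql_id,
       SUM(executions) AS total_executions
FROM   v$sqlarea
WHERE  force_matching_signature <> 0
GROUP  BY force_matching_signature
HAVING COUNT(*) > 20
ORDER  BY similar_statements DESC
FETCH FIRST 10 ROWS ONLY;
```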

Step 3: Resolution

Quick wins:

  • Add missing indexes to eliminate full table scans
  • Convert literals to bind variables to reduce parsing
  • Implement connection pooling to reduce session overhead
  • Rewrite SQL to avoid expensive functions in WHERE clauses

6. Troubleshooting Scenario #2: I/O Wait Analysis

Symptom: AWR report shows 70% wait time in "db file sequential read" and "db file scattered read".

Understanding I/O Wait Events

Wait Event              | What It Means                                 | Common Causes
------------------------|-----------------------------------------------|----------------------------------------------
db file sequential read | Single-block read (index access)              | Inefficient indexes, nested-loop joins over many rows
db file scattered read  | Multi-block read (full scan)                  | Missing indexes; full scans on small tables are fine
direct path read        | Read bypassing the buffer cache               | Large parallel queries, LOBs
read by other session   | Waiting for a block another session is reading| Hot blocks, buffer busy contention
SQL - Find SQL Causing Most I/O
-- Top SQL by physical reads
SELECT sql_id,
       plan_hash_value,
       executions_delta AS executions,
       disk_reads_delta AS physical_reads,
       ROUND(disk_reads_delta/NULLIF(executions_delta, 0), 2) AS reads_per_exec,
       ROUND((disk_reads_delta * 8192)/1024/1024, 2) AS data_read_mb  -- assumes 8KB block size
FROM   dba_hist_sqlstat
WHERE  snap_id BETWEEN 12345 AND 12350
AND    executions_delta > 0
ORDER  BY disk_reads_delta DESC
FETCH FIRST 20 ROWS ONLY;

Identifying Which Objects Cause I/O

SQL - Top Segments by Physical Reads
-- Which tables/indexes cause the most I/O
SELECT o.owner,
       o.object_name,
       o.object_type,
       SUM(s.physical_reads_delta) AS total_reads,
       ROUND(SUM(s.physical_reads_delta * 8192)/1024/1024, 2) AS data_read_mb  -- assumes 8KB block size
FROM   dba_hist_seg_stat s
JOIN   dba_hist_seg_stat_obj o
       ON s.obj# = o.obj# AND s.dataobj# = o.dataobj#
WHERE  s.snap_id BETWEEN 12345 AND 12350
GROUP  BY o.owner, o.object_name, o.object_type
ORDER  BY total_reads DESC
FETCH FIRST 20 ROWS ONLY;

7. Troubleshooting Scenario #3: Lock Contention

Symptom: Users report "hanging" transactions. AWR shows high "enq: TX - row lock contention" wait time.

SQL - Current Lock Contention
-- Currently blocked sessions
SELECT blocking_session,
       sid AS blocked_sid,
       serial#,
       username,
       sql_id,
       event,
       wait_time,
       seconds_in_wait,
       state
FROM   v$session
WHERE  blocking_session IS NOT NULL
ORDER  BY seconds_in_wait DESC;

-- What the blocking session is doing
SELECT s.sid, s.serial#, s.username, s.status, s.sql_id, sq.sql_text
FROM   v$session s
LEFT JOIN v$sql sq ON s.sql_id = sq.sql_id
WHERE  s.sid IN (
         SELECT DISTINCT blocking_session
         FROM   v$session
         WHERE  blocking_session IS NOT NULL
       );

Historical Lock Analysis with ASH

SQL - Historical Lock Contention (ASH)
-- Which SQL statements caused locks (last 2 hours)
SELECT sql_id,
       blocking_session,
       COUNT(*)      AS lock_samples,
       COUNT(*) * 10 AS approx_wait_seconds,  -- DBA_HIST ASH keeps one sample per 10 seconds
       MIN(sample_time) AS first_seen,
       MAX(sample_time) AS last_seen
FROM   dba_hist_active_sess_history
WHERE  sample_time > SYSDATE - INTERVAL '2' HOUR
AND    event LIKE '%enq: TX%'
AND    blocking_session IS NOT NULL
GROUP  BY sql_id, blocking_session
ORDER  BY lock_samples DESC;

8. Essential Diagnostic Queries Every DBA Needs

These are the queries I run most frequently when troubleshooting performance issues.

SQL - Database Health Check (One Query)
-- Comprehensive health check
SELECT 'Active Sessions' AS metric, COUNT(*) AS value
FROM   v$session
WHERE  status = 'ACTIVE'
UNION ALL
SELECT 'Buffer Cache Hit Ratio',
       ROUND((1 - (phy.value / (cur.value + con.value))) * 100, 2)
FROM   v$sysstat cur, v$sysstat con, v$sysstat phy
WHERE  cur.name = 'db block gets'
AND    con.name = 'consistent gets'
AND    phy.name = 'physical reads'
UNION ALL
SELECT 'Library Cache Hit Ratio',
       ROUND((SUM(pins) - SUM(reloads)) / SUM(pins) * 100, 2)
FROM   v$librarycache
UNION ALL
SELECT 'PGA Memory Used (MB)',
       ROUND(SUM(s.value)/1024/1024, 2)
FROM   v$sesstat s, v$statname n
WHERE  s.statistic# = n.statistic#
AND    n.name = 'session pga memory';
SQL - Current Top 10 Queries by Execution Time
-- What's running slow right now
SELECT sql_id,
       child_number,
       executions,
       ROUND(elapsed_time/1000000, 2) AS elapsed_sec,
       ROUND(cpu_time/1000000, 2)     AS cpu_sec,
       buffer_gets,
       disk_reads,
       ROUND(elapsed_time/NULLIF(executions, 0)/1000000, 4) AS sec_per_exec
FROM   v$sql
WHERE  executions > 0
AND    elapsed_time > 1000000  -- more than 1 second total
ORDER  BY elapsed_time DESC
FETCH FIRST 10 ROWS ONLY;

9. FAQ

Do I need a Diagnostics Pack license to use AWR?
Yes. AWR requires the Oracle Diagnostics Pack, a separately licensed Enterprise Edition option. If you don't have this license, Oracle can audit usage and charge retroactively. Alternative: use Statspack (free), which provides similar functionality with less detail. To check what is enabled, run SHOW PARAMETER control_management_pack_access (it should be NONE if no management packs are licensed) and review DBA_FEATURE_USAGE_STATISTICS for actual AWR usage. On Standard Edition, AWR is not available at all; use Statspack instead.
What's the difference between ASH and AWR?
AWR is a time-series data warehouse with hourly snapshots showing aggregate performance metrics. ASH is a real-time sampling tool that captures what every active session is doing every second. Think of it this way: AWR = "What happened over the last hour?" ASH = "What's happening right now with this specific session?" Use AWR for historical analysis and trend identification. Use ASH for active troubleshooting and session-level diagnostics. ASH data is eventually rolled up into AWR, so AWR contains ASH history but with less granularity.
How long should I keep AWR snapshots?
Default is 8 days (11,520 minutes). For production databases, I recommend 30-45 days (43,200-64,800 minutes). This gives you enough history for month-over-month comparisons and investigating issues that occurred weeks ago. Storage cost is minimal—AWR snapshots typically use 50-200MB per day. For critical systems, consider 60-90 days. Beyond 90 days, diminishing returns—you're more likely referencing different workload patterns. Use DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS to adjust retention.
Should I mention AWR/ASH experience on my DBA resume?
Absolutely. Write: "Expert-level proficiency in Oracle performance diagnostics using AWR reports, ASH queries, and SQL Monitor for production troubleshooting. Reduced query execution time by X% through systematic analysis of wait events, execution plans, and system statistics." These tools are fundamental to Oracle DBA work—not mentioning them is like a developer not mentioning they know Git. Include specific examples: "Identified and resolved buffer busy wait contention using ASH historical data, reducing peak load response time from 8 seconds to 1.2 seconds."


About the Author

Chetan Yadav

Chetan Yadav is a Senior Oracle, PostgreSQL, MySQL, and Cloud DBA with 15+ years of hands-on experience managing production databases across on-premises, hybrid, and cloud environments. He specializes in high availability architecture, performance tuning, disaster recovery, and database migrations.

Throughout his career, Chetan has resolved hundreds of production performance issues using AWR, ASH, and SQL Monitor. He has optimized mission-critical Oracle databases serving millions of transactions daily in finance, healthcare, and e-commerce sectors.

Chetan is passionate about mentoring early-career DBAs and sharing real-world production lessons that aren't found in documentation. His writing focuses on practical decision-making, career growth, and systematic troubleshooting methodologies that work under pressure.

This blog focuses on real-world DBA problems, career growth, and practical learning — not theoretical documentation or vendor marketing.
