Monday, June 8, 2026

Oracle High CPU Usage: Causes and Fix in 19c

Oracle High CPU Usage: Causes and Fix in 19c
Oracle 19c Performance Tuning

Oracle High CPU Usage: Causes and Fix in 19c

 Chetan Yadav — Senior Oracle DBA ⏱ 10 min read  Troubleshoot Guide

It is 3:12 AM. PagerDuty fires. The alert reads: Oracle host CPU at 94% for 8 minutes. You log into the server. The database process is consuming nearly every available core. The application is timing out. Users are getting errors. And nobody knows which query started it or when.

This is the oracle high cpu usage fix 19c scenario that every production DBA eventually faces. The frustrating part is not the incident itself — it is that in most cases the fix takes under ten minutes once you know exactly where to look. The problem is that without a structured diagnostic sequence, engineers burn 45 minutes checking the wrong things first.

This guide covers the three root causes behind the majority of Oracle 19c CPU spikes: runaway SQL and full table scans, parallel query overload, and latch contention in the shared pool. For each cause: exact diagnostic SQL, correct and wrong result interpretation, the fix, its trade-off, and the confirmation step.

Oracle high cpu usage fix 19c — diagnostic flowchart showing runaway SQL, parallel query overload, and latch contention paths with SQL commands

Cause 1: Runaway SQL and Oracle CPU Spike from Bad Execution Plans

The most common cause of a sudden Oracle CPU spike in 19c is a single SQL statement consuming a disproportionate share of CPU — typically because it is performing a full table scan on a large table, using a suboptimal join method, or has had its execution plan invalidated by a statistics refresh or cursor invalidation event. In Oracle 19c, Adaptive Query Optimization can occasionally select an execution plan that was correct for initial statistics but behaves catastrophically under production data distribution, and the optimizer will not self-correct until the cursor ages out of the shared pool.

The threshold where a runaway SQL becomes a host-level incident rather than a slow query is when it consumes more CPU than the OS can schedule on its available cores, creating a queue of waiting processes. On a 16-core host, a single parallel query with DOP 32 can drive CPU to 100% in under a minute.

Production Indicator If CPU spiked within minutes of a statistics job completion, an index drop, or an application deployment, assume a plan change first. This is the most common trigger in 19c environments.

Step 1: Find the top CPU-consuming SQL right now.

SQL — Top CPU Consumers from V$SQL
-- Run this first during the incident
SELECT sql_id,
       ROUND(cpu_time/1000000, 2) AS cpu_seconds,
       executions,
       ROUND(cpu_time/NULLIF(executions,0)/1000000,4) AS cpu_per_exec,
       SUBSTR(sql_text, 1, 80) sql_preview
FROM   v$sql
WHERE  last_active_time > SYSDATE - 1/24
ORDER BY cpu_time DESC
FETCH FIRST 10 ROWS ONLY;

Correct result: CPU seconds spread across many SQL IDs, no single statement above 20% of total. Wrong result: One SQL ID showing 80%+ of cumulative CPU seconds. That is your runaway statement.

Step 2: Check what is currently running on CPU using ASH.

SQL — Active Sessions ON CPU (ASH)
-- Shows sessions currently burning CPU right now
SELECT ash.sql_id,
       ash.session_id,
       ash.session_serial#,
       ash.module,
       COUNT(*) AS ash_samples
FROM   v$active_session_history ash
WHERE  ash.sample_time > SYSDATE - 5/1440
  AND  ash.session_state = 'ON CPU'
GROUP BY ash.sql_id, ash.session_id,
         ash.session_serial#, ash.module
ORDER BY ash_samples DESC;

Correct result: No SQL ID appearing in more than 20% of samples over 5 minutes. Wrong result: One SQL ID dominating ASH samples consistently — this is the active runaway. Note the SESSION_ID and SESSION_SERIAL#.

Fix: Kill the session, then fix the plan.

SQL — Kill Session + Apply SQL Profile
-- Step A: Kill the runaway session immediately
ALTER SYSTEM KILL SESSION '<sid>,<serial#>' IMMEDIATE;

-- Step B: Lock the good plan using a SQL Profile
-- First, get the sql_id and the plan hash of the good plan from AWR
DBMS_SQLTUNE.IMPORT_SQL_PROFILE(
  sql_text    => '<full SQL text>',
  profile     => SQLPROF_ATTR('OPT_PARAM(''optimizer_features_enable'' ''19.1.0'')'),
  name        => 'PROFILE_FIX_RUNAWAY_SQL',
  force_match => TRUE
);

-- Step C: Flush only the bad cursor from shared pool
EXEC SYS.DBMS_SHARED_POOL.PURGE('<address>,<hash_value>', 'C');
Trade-off Killing a session with IMMEDIATE rolls back its transaction, which may take time proportional to the undo generated. Do not kill sessions during a financial period close without confirming the transaction is safe to roll back. A SQL Profile is a better long-term fix than hints in the query, because it survives code deployments.

Cause 2: Parallel Query CPU Overload from Uncontrolled DOP in Oracle 19c

Parallel query is Oracle's most powerful performance feature and one of its most dangerous misuse patterns. In Oracle 19c with PARALLEL_DEGREE_POLICY = AUTO, the optimizer can autonomously decide to execute a full table scan or aggregation in parallel, potentially spawning PX slave processes well beyond what the host hardware can sustain. When the degree of parallelism exceeds the number of available CPU cores, the OS scheduler begins context-switching between PX slaves — and context switching itself consumes CPU, creating a feedback loop where adding more parallel slaves makes the query slower while driving CPU higher.

The specific threshold that changes behavior: on a host with N CPU cores, any query spawning more than N/2 PX slaves will begin competing with itself for CPU scheduler time. At N PX slaves, the overhead typically exceeds the parallelism benefit for most OLTP-adjacent workloads. This behavior is version-specific: 19c with AUTO policy is more aggressive than 12c in its autonomous DOP selection, and the PARALLEL_MIN_TIME_THRESHOLD parameter (default 10 seconds) may be set too low in environments that were migrated from earlier versions.

SQL — Identify Parallel Query Overload
-- Step 1: Count active PX slaves per coordinator session
SELECT qcsid,
       qcserial#,
       COUNT(*) AS px_slave_count,
       server_group
FROM   v$px_session
GROUP BY qcsid, qcserial#, server_group
ORDER BY px_slave_count DESC;

-- Step 2: Check current parallel parameters
SHOW PARAMETER parallel_max_servers;
SHOW PARAMETER parallel_degree_policy;
SHOW PARAMETER cpu_count;

Correct result: px_slave_count per coordinator below cpu_count / 2. Wrong result: Any coordinator showing px_slave_count above cpu_count — this coordinator is driving a CPU overload event. If parallel_degree_policy is AUTO and cpu_count is 32, a px_slave_count of 64 means the optimizer autonomously doubled available parallelism.

SQL — Fix Parallel Overload Without Restart
-- Immediate fix: limit DOP system-wide
ALTER SYSTEM SET parallel_degree_policy = MANUAL;
ALTER SYSTEM SET parallel_max_servers = 16; -- = cpu_count / 2

-- Raise the auto-parallel threshold to prevent future AUTO overselection
ALTER SYSTEM SET parallel_min_time_threshold = 30; -- seconds

-- Limit DOP for a specific table permanently
ALTER TABLE schema.large_table PARALLEL 4;

-- Confirm: verify no coordinator now exceeds cpu_count/2 slaves
SELECT qcsid, COUNT(*) slave_count
FROM v$px_session
GROUP BY qcsid
ORDER BY slave_count DESC;
Trade-off Switching from AUTO to MANUAL parallel degree policy disables automatic parallelism for full table scans and aggregations. Long-running batch jobs that relied on AUTO parallelism will run slower. Coordinate with the batch team before making this change permanent. A middle ground is setting PARALLEL_DEGREE_POLICY = LIMITED, which applies auto-DOP only to queries that already have a PARALLEL hint or table-level parallel degree set.

Cause 3: Latch Contention and Library Cache CPU Spin in Oracle 19c

Latch contention is the CPU spike cause that most engineers do not consider first, because it does not involve a visible slow query. When multiple sessions compete for the same latch — most commonly the library cache latch or the shared pool latch — Oracle sessions spin in a CPU-burning busy-wait loop before escalating to a sleep. This spin-wait is intentional: it is faster than a context switch for short contention windows. But when contention is sustained, hundreds of sessions spinning simultaneously drive CPU utilization to the ceiling while appearing idle from an application perspective.

In Oracle 19c, the most common trigger for library cache latch contention is hard parsing at high frequency. Hard parsing occurs when a SQL statement cannot be matched to an existing cursor in the shared pool, typically because literals are used instead of bind variables, because the shared pool is undersized for the workload, or because a cursor invalidation event (statistics refresh, DDL on an underlying object) forced mass re-parsing. The shared pool latch protects the memory management structures of the shared pool itself and contends under heavy allocation and deallocation activity.

SQL — Diagnose Latch Contention
-- Step 1: Find latches with high sleep rates
SELECT name,
       gets,
       misses,
       sleeps,
       spin_gets,
       ROUND(sleeps/NULLIF(gets,0)*100, 4) miss_pct
FROM   v$latch
WHERE  sleeps > 1000
ORDER BY sleeps DESC
FETCH FIRST 10 ROWS ONLY;

-- Step 2: Check hard parse rate
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('parse count (hard)',
                'parse count (total)',
                'execute count')
ORDER BY name;

Correct result: miss_pct below 0.5% for library cache latch. Hard parse count below 5% of execute count. Wrong result: Library cache latch miss_pct above 1%, combined with hard parse count above 20% of execute count. This confirms literal SQL driving hard parses and latch contention.

SQL — Find Literal SQL Driving Hard Parses
-- Find SQL with high version counts (literal explosion)
SELECT sql_id,
       version_count,
       SUBSTR(sql_text, 1, 100) sql_preview
FROM   v$sqlarea
WHERE  version_count > 20
ORDER BY version_count DESC
FETCH FIRST 10 ROWS ONLY;

-- Fix: resize shared pool online
ALTER SYSTEM SET shared_pool_size = 2G SCOPE = BOTH;

-- Enable cursor sharing for literal SQL (temporary relief)
ALTER SYSTEM SET cursor_sharing = FORCE;

-- Confirm: flush shared pool then check latch sleeps trend
ALTER SYSTEM FLUSH SHARED_POOL;
-- Re-run v$latch query above and compare sleeps delta
Trade-off Setting CURSOR_SHARING = FORCE is a temporary relief measure, not a permanent fix. It causes Oracle to replace literals with system-generated bind variables, which can produce suboptimal plans for queries where the optimizer needs literal values for histogram-based selectivity. The correct permanent fix is to modify the application to use bind variables. Use FORCE only during the incident window, then audit and fix the SQL.

What Most DBAs Miss When Diagnosing Oracle High CPU Usage

The diagnostic step that consistently gets skipped is correlating the CPU spike timestamp with the AWR snapshot boundary. When Oracle CPU usage spikes in production, the instinct is to run live queries against V$ views. But if the spike already peaked and is subsiding, V$ views show the current state — not the peak state. AWR captures a snapshot every 60 minutes by default. The peak CPU event may have started and ended within a single AWR interval, leaving no clean before/after comparison.

The correct approach for any Oracle CPU spike that lasts more than 15 minutes is to immediately take a manual AWR snapshot at the peak:

SQL — Manual AWR Snapshot During Incident
-- Take snapshot immediately at peak CPU (preserves state)
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT();

-- After incident, generate AWR report between the two snapshots
-- Replace snap_id values with actual snapshot IDs
SELECT *
FROM TABLE(
  DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_TEXT(
    l_dbid       => &dbid,
    l_inst_num   => 1,
    l_bid        => &begin_snap_id,
    l_eid        => &end_snap_id
  )
);

The second thing most engineers miss is checking V$OSSTAT for OS-level CPU breakdown. A high Oracle process CPU does not always mean Oracle is the problem — another OS process could be competing. V$OSSTAT shows the split between user CPU, system CPU, and idle time from the database's perspective, and it takes 30 seconds to check.

SQL — OS CPU Breakdown from Inside Oracle
SELECT stat_name,
       value,
       ROUND(value / 1e6, 2) cpu_seconds
FROM   v$osstat
WHERE  stat_name IN (
  'NUM_CPUS', 'NUM_CPU_CORES',
  'BUSY_TIME', 'IDLE_TIME',
  'USER_TIME', 'SYS_TIME'
)
ORDER BY stat_name;

Prevention: Monitoring Setup for Oracle CPU Spikes in 19c

1

Enable automatic AWR baselines for peak periods

A static AWR baseline captures the normal operating range for CPU and other metrics. Any deviation from baseline triggers a meaningful alert rather than a raw threshold alert that fires even during legitimate batch windows.

Run: EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_BASELINE(start_snap_id => &normal_snap, end_snap_id => &normal_snap_end, baseline_name => 'NORMAL_PEAK');

2

Set a hard limit on PARALLEL_MAX_SERVERS

The default value is derived from cpu_count and is often too high for OLTP databases that occasionally run ad-hoc analytical queries. Set PARALLEL_MAX_SERVERS to cpu_count at startup and document the rationale. The most commonly skipped step: verifying this parameter survived a database restart after patching. Check it monthly.

3

Monitor hard parse rate daily via a scheduled query

Add a daily check of hard parse count as a percentage of execute count. Any day where hard parses exceed 10% of executes warrants investigation before it becomes a latch contention incident. Alert threshold: hard parse rate above 15% of execute count. Schedule this check to run at 08:00 daily and email results to the DBA team.

Frequently Asked Questions

How do I find which SQL is causing high CPU in Oracle 19c?
Query V$SQL ordered by CPU_TIME descending for cumulative CPU consumers, or use V$ACTIVE_SESSION_HISTORY filtering on SESSION_STATE='ON CPU' for real-time identification. ASH is the faster diagnostic during an active incident because it shows what is running right now, while V$SQL shows what has consumed the most CPU since cursor load. Both queries are in the diagnostic section above with exact column lists and correct-vs-wrong result interpretation.
Can parallel query cause Oracle high CPU usage on a well-sized host?
Yes, and it is more likely on a well-sized host than an undersized one. A host with 64 cores and PARALLEL_DEGREE_POLICY=AUTO can spawn 128 PX slaves for a single query. The OS scheduler cannot run 128 threads on 64 cores simultaneously, and the context-switching overhead itself burns CPU. A well-sized host gives Oracle more cores to misuse with AUTO parallelism. The fix is setting an explicit ceiling with PARALLEL_MAX_SERVERS and switching from AUTO to LIMITED or MANUAL degree policy for OLTP workloads.
What causes latch contention in Oracle 19c and how does it relate to CPU spikes?
Latch contention occurs when hundreds of sessions simultaneously compete for the same in-memory serialisation structure — most commonly the library cache latch protecting cursor memory, or the shared pool latch protecting memory allocation. Oracle sessions spin in a CPU-burning busy-wait before sleeping. High-frequency literal SQL that generates thousands of unique cursor versions is the most common trigger in 19c environments. The fix combines resizing the shared pool online, setting CURSOR_SHARING=FORCE as temporary relief, and fixing the application to use bind variables.
How do I apply oracle high cpu usage fix 19c without restarting the database?
All three fixes described in this guide are online operations that require no database restart. Kill the runaway session with ALTER SYSTEM KILL SESSION ... IMMEDIATE. Apply a SQL Profile to lock a good execution plan using DBMS_SQLTUNE.IMPORT_SQL_PROFILE. Limit parallel degree with ALTER SYSTEM SET PARALLEL_DEGREE_POLICY=MANUAL. Resize the shared pool with ALTER SYSTEM SET SHARED_POOL_SIZE=Xg SCOPE=BOTH. Each change takes effect within seconds. Confirm with the verification queries provided in each section above.

Want the Complete Oracle Performance Toolkit?

Drop a comment on this post with the keyword below and I will send you the AWR Analysis Walkthrough PDF — includes the full diagnostic SQL pack, execution plan interpretation guide, and the monitoring dashboard queries I use in production.

Comment: AWR
Oracle DBA Performance Tuning Oracle 19c Production DBA High Availability LevelUp Careers

No comments:

Post a Comment