This guide explains how Oracle Listener failures silently impact production systems
Oracle Listener Health Check – Production Monitoring Guide | Chetan Yadav
⏱️ Estimated Reading Time: 14 minutes
Oracle Listener Health Check
It’s 2 AM. Your phone lights up with alerts. Applications are down, dashboards are red,
and every connection attempt fails with TNS-12541: TNS:no listener.
The database is up — but the business is still dead.
In real production environments, a failed Oracle Listener can block thousands of users,
cause SLA breaches, and trigger revenue loss within minutes.
We’ve seen P99 login latency jump from milliseconds to total outages.
This guide shows how to implement a production-grade Oracle Listener health check
using scripts, monitoring logic, and automation — before the listener becomes your single
point of failure.
Oracle 9i on Solaris Case Study – A Deep Dive Retro
⏱️ Estimated Reading Time: 12-13 minutes
Oracle 9i on Solaris Case Study (A Deep Dive Retro)
In the early 2000s, database outages were not masked by load balancers,
auto-scaling groups, or managed cloud services.
If an Oracle database went down on Solaris,
the business felt it immediately.
There was no RAC, no ASM, no cloud snapshots.
Every failure meant manual diagnosis,
filesystem checks, kernel tuning, and long recovery windows.
Downtime translated directly into revenue loss and SLA breaches.
This case study revisits a real-world Oracle 9i on Solaris production environment,
breaking down architecture decisions, tuning practices, failures,
and the lessons modern DBAs can still apply today.
In a production Oracle Real Application Clusters (RAC) environment, stability is everything. A single corrupt OCR, a missing Voting Disk, or an unstable CRS stack can lead to node evictions (split-brain scenarios) and unplanned downtime.
This article provides a comprehensive Shell Script for RAC Cluster Health Audits. It covers CRS status, OCR integrity, Voting Disk validation, and resource stability checks—perfect for daily monitoring or pre-patching validation.
Table of Contents
Why You Must Audit RAC Cluster Health Daily
Production-Ready RAC Health Check Script (Shell)
Script Output & Analysis Explained
Critical Components: OCR, Voting Disk & CRS
Troubleshooting Common RAC Issues
How to Automate This Audit (Cron)
Interview Questions: RAC Troubleshooting
Final Summary
FAQ
About the Author
1. Why You Must Audit RAC Cluster Health Daily
Oracle RAC relies on a complex stack of clusterware services. Neglecting these checks leads to:
Node Evictions: Caused by heartbeat failures or voting disk I/O timeouts.
OCR Corruption: Resulting in the inability to start the clusterware stack.
Resource Regressions: Services or VIPs flapping between nodes.
Split-Brain Syndrome: Where nodes lose communication and fight for control.
Running a unified audit script ensures you catch "INTERMEDIATE" or "OFFLINE" states before they become outages.
2. Production-Ready RAC Health Check Script
This shell script checks the core pillars of RAC stability: CRS Stack, OCR, Voting Disks, and Resource Status.
Note: Execute this script as the grid (or root) user.
Prepare for these common questions during senior DBA interviews:
Q: What is a split-brain scenario in RAC? A: When nodes lose private network communication and both try to write to the database. Voting disk prevents this by fencing off one node.
Q: How do you backup OCR? A: Oracle automatically backs up OCR every 4 hours. You can also manually backup using `ocrconfig -manualbackup`.
Q: What command checks the private interconnect IPs? A: `oifcfg getif`.
8. Final Summary
A healthy RAC cluster requires vigilant monitoring of the clusterware stack, not just the database instances. The script provided above is a fundamental tool for checking CRS, OCR, and Voting Disk health instantly.
Use this script as part of your Weekly Health Check routine (as suggested in the Nov 2025 schedule) to ensure 99.999% availability.
9. FAQ
Q1: Can I run this script as the 'oracle' user? A: Most `crsctl` check commands work, but `ocrcheck` and deep diagnostics usually require `grid` or `root` privileges.
Q2: What should I do if OCR check fails? A: Restore from the latest automatic backup using `ocrconfig -restore`. Do not restart the stack until resolved.
Q3: Does this cause performance impact? A: No, these are lightweight metadata queries.
About the Author
Chetan Yadav is a Senior Oracle, PostgreSQL, MySQL and Cloud DBA with 14+ years of experience supporting high-traffic production environments across AWS, Azure and on-premise systems. His expertise includes Oracle RAC, ASM, Data Guard, performance tuning, HA/DR design, monitoring frameworks and real-world troubleshooting.
He trains DBAs globally through deep-dive technical content, hands-on sessions and automation workflows. His mission is to help DBAs solve real production problems and advance into high-paying remote roles worldwide.
Call to Action If you found this helpful, follow my blog and LinkedIn for deep Oracle, MySQL, and RAC content. I publish real production issues, scripts, and monitoring guides to help you level up your DBA career.
Monitoring is the backbone of every production database. SQL-level monitoring helps DBAs detect slow queries, high load SQLs, blocked sessions, and performance issues before they become incidents.
This article provides a production-ready SQL monitoring script, explains each part of it, and shows how DBAs can integrate it into daily checks.
Table of Contents
1. What Does SQL Monitoring Mean for DBAs?
2. Production-Ready SQL Monitoring Script
3. Script Output Explained
4. Performance Metrics You Must Watch
5. Real-World DBA Scenario
6. Automation Options
7. FAQ
8. Final Thoughts
1. What Does SQL Monitoring Mean for Oracle DBAs?
SQL monitoring helps identify:
High buffer gets SQL
High CPU-consuming SQL
SQL with long execution time
SQL causing row locks
SQL creating IO bottlenecks
SQL generating alerts or session waits
These insights allow proactive tuning and avoid outages.
2. Production-Grade Oracle SQL Monitoring Script
Below is a clean, ready-to-run script used by senior DBAs in real production environments:
SET LINES 200SET PAGES 200SELECT
s.sid,
s.serial#,
s.username,
s.status,
p.spid AS os_pid,
q.sql_id,
q.sql_text,
ROUND(s.cpu_time/1000000,2) AS cpu_sec,
s.blocking_session,
s.event,
s.wait_class,
s.last_call_et AS seconds_since_last_call
FROM v$session s
LEFT JOIN v$sql q ON s.sql_id = q.sql_id
LEFT JOIN v$process p ON s.paddr = p.addr
WHERE s.username ISNOTNULLORDERBY s.cpu_time DESCFETCH FIRST20ROWSONLY;
3. What This Script Reveals
Each column has a purpose:
Column
Meaning
SID / SERIAL#
Unique session ID
USERNAME
Which user is running SQL
STATUS
ACTIVE / INACTIVE
OS_PID (spid)
OS-level process identifier
SQL_ID
Query identifier
SQL_TEXT
Actual SQL query
CPU_SEC
CPU usage of query
BLOCKING_SESSION
Detect blocking
EVENT
What the session is waiting on
WAIT_CLASS
IO, CPU, Network, etc.
LAST_CALL_ET
Query running time
This helps diagnose real production issues in minutes.
For official Oracle reference, visit the Oracle Database Documentation:
https://docs.oracle.com/en/database/
4. Add-on: Top SQL by Buffer Gets
SELECT sql_id,
buffer_gets,
executions,
ROUND(buffer_gets/DECODE(executions,0,1,executions)) AS gets_per_exec,
sql_text
FROM v$sqlORDERBY buffer_gets DESCFETCH FIRST10ROWSONLY;
Use this to find high-cost SQL.
5. Add-on: Top SQL by IO
SELECT sql_id,
disk_reads,
direct_writes,
sql_text
FROM v$sqlORDERBY disk_reads DESCFETCHFIRST10ROWSONLY;
Useful for IO-heavy systems and slow storage layers.
6. Real-World DBA Use Case
Example scenario:
Application suddenly slows down
Sessions start waiting on "db file sequential read"
Your script shows high CPU session + SQL_ID
SQL belongs to a reporting job running full table scans
You tune it (add index / rewrite SQL / fix stats)
Performance normalizes in seconds
This is exactly what impresses recruiters and decision-makers.
7. How To Automate These Checks
DBAs usually automate this:
Add SQL scripts to a shell script
Run at intervals using cron or OEM
Forward output to email / Teams / Slack
Integrate with CloudWatch / SignalFx
Store SQL history for future tuning
I can prepare your automation version in bash or PowerShell if you want.
8. Interview Questions Based on This Topic
Expect these in interviews:
How do you identify top CPU SQL?
What is SQL_ID and why is it important?
Difference between buffer gets and disk reads?
How will you find blocking sessions?
Which wait events show IO bottlenecks?
9. Final Summary
A good SQL monitoring script gives DBAs visibility into session health, CPU load, IO bottlenecks, and blocking in real time. This script is production-ready and can be used for performance tuning, health checks, or automated monitoring.
FAQ – Oracle SQL Monitoring
Q1: What is SQL monitoring in Oracle? SQL monitoring helps identify CPU-heavy SQLs, IO waits, blocking sessions, and real-time performance problems.
Q2: Which Oracle versions support this script? The script works on 11g, 12c, 18c, and 19c as long as v$session and v$sql views are available.
Q3: When should DBAs run this script? During slow application response, high CPU alerts, IO bottlenecks, blocking chains, or performance degradation.
Q4: Can this script be automated? Yes—using cron, OEM, Python, or n8n workflows for scheduled reporting.
Chetan Yadav is a Senior Oracle, PostgreSQL, MySQL & Cloud DBA with 14+ years of experience supporting high-traffic production environments across AWS, Azure, and on-premise systems. His core expertise includes Oracle RAC, ASM, Data Guard, performance tuning, HA/DR design, monitoring frameworks, and real-world troubleshooting.
He also trains DBAs globally through deep-dive technical content, hands-on sessions, and automation workflows using n8n, AI tools, and modern monitoring stacks. His mission is to help DBAs solve real production problems and grow into high-paying remote roles worldwide.
Chetan regularly publishes expert content across Oracle, PostgreSQL, MySQL, and Cloud DBA technologies—including performance tuning guides, DR architectures, monitoring tools, scripts, and real incident-based case studies.
These platforms feature guides, scripts, diagrams, troubleshooting workflows, and real-world DBA case studies designed for database professionals worldwide.