Introduction
Oracle Exadata is an engineered system designed for high-performance database workloads. Regular health checks help identify issues before they impact database availability, performance, or storage operations.
This guide provides essential health check commands for Exadata Database Servers, Storage Servers (Cells), and InfiniBand Network components.
Environment
Example Environment:
| Component | Hostname |
| Database Server 1 | db01 |
| Database Server 2 | db02 |
| Storage Server 1 | cell01 |
| Storage Server 2 | cell02 |
| Storage Server 3 | cell03 |
1. Verify Exadata Software Versions
Log in to a storage server and start CellCLI:
cellcli
Check Cell Software Version:
list cell attributes releaseVersion
Example Output:
21.2.18.0.0
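When several cells are checked at once, a version mismatch is easier to spot in a script than by eye. The following is a minimal Python sketch, assuming the version output has been collected with dcli so that each line carries the usual `hostname: output` prefix. The function names and the sample text are illustrative and do not come from a real system.

```python
def versions_by_cell(dcli_output):
    """Map each cell hostname to the release version it reported.

    Assumes lines shaped like 'cell01: 21.2.18.0.0' (dcli-style prefix).
    """
    versions = {}
    for line in dcli_output.splitlines():
        host, sep, version = line.partition(":")
        if sep and version.strip():
            versions[host.strip()] = version.strip()
    return versions


def mismatched_cells(versions):
    """Return the cells whose version differs from the most common one."""
    if not versions:
        return {}
    values = list(versions.values())
    baseline = max(set(values), key=values.count)
    return {host: v for host, v in versions.items() if v != baseline}


# Hypothetical sample output for the three cells in the example environment.
sample = """cell01: 21.2.18.0.0
cell02: 21.2.18.0.0
cell03: 21.2.17.0.0"""

print(mismatched_cells(versions_by_cell(sample)))
# -> {'cell03': '21.2.17.0.0'}
```

If two versions are equally common, the choice of baseline is arbitrary, so a tie is better treated as a finding in its own right.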
2. Check Cell Status
list cell detail
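Verify:
- status = online
- cellsrvStatus, msStatus, and rsStatus = running
- Flash cache and grid disk states are not part of this output; check them separately (sections 4 and 6)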
3. Verify Physical Disks
list physicaldisk attributes name,status
Expected:
normal
Investigate immediately if any disk reports one of the following statuses:
warning
predictive failure
critical
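The status check above can be automated by scanning the command output for anything other than `normal`. This is a minimal Python sketch, assuming each output line holds a disk name followed by its status, separated by whitespace, and that a status may itself contain spaces. The function name and sample text are hypothetical.

```python
def abnormal_disks(cellcli_output, expected="normal"):
    """Return (name, status) pairs whose status differs from `expected`."""
    findings = []
    for line in cellcli_output.splitlines():
        parts = line.split(None, 1)  # first token is the name, the rest is the status
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, status = parts[0], parts[1].strip()
        if status.lower() != expected:
            findings.append((name, status))
    return findings


# Hypothetical output of: list physicaldisk attributes name,status
sample = (
    "\t 20:0 \t normal\n"
    "\t 20:1 \t warning - predictive failure\n"
    "\t FLASH_1_0 \t normal\n"
)

print(abnormal_disks(sample))
# -> [('20:1', 'warning - predictive failure')]
```

Passing `expected="active"` lets the same function be reused for grid disk output.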
4. Verify Grid Disks
list griddisk attributes name,status
Expected:
active
5. Verify Cell Disks
list celldisk attributes name,status
Expected:
normal
6. Check Flash Cache Status
list flashcache detail
Verify:
- status = normal
- degradedCelldisks is empty
7. Check Flash Log Status
list flashlog detail
Confirm that the flash log exists and reports status = normal.
8. Verify ASM Disk Groups
Database Server:
sqlplus / as sysasm
Execute:
SELECT
    name,
    state,
    type,
    total_mb,
    free_mb
FROM v$asm_diskgroup;
Verify:
- STATE = MOUNTED for every disk group
- Adequate free space (compare FREE_MB with TOTAL_MB)
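"Adequate free space" is easier to enforce as a percentage threshold. The sketch below is one way to do that in Python, assuming the query rows have already been fetched as tuples in the column order shown above. The 20% threshold, the function name, and the sample disk group names are assumptions for illustration only.

```python
def disk_group_findings(rows, min_free_pct=20.0):
    """Flag disk groups that are not mounted or are low on free space.

    rows: iterable of (name, state, type, total_mb, free_mb) tuples.
    min_free_pct: assumed threshold, adjust to local policy.
    """
    findings = []
    for name, state, _type, total_mb, free_mb in rows:
        if state != "MOUNTED":
            findings.append((name, f"state is {state}"))
            continue
        free_pct = 100.0 * free_mb / total_mb if total_mb else 0.0
        if free_pct < min_free_pct:
            findings.append((name, f"{free_pct:.1f}% free"))
    return findings


# Hypothetical rows; the disk group names are examples only.
rows = [
    ("DATAC1", "MOUNTED", "HIGH", 1000000, 450000),
    ("RECOC1", "MOUNTED", "NORMAL", 500000, 50000),
]

print(disk_group_findings(rows))
# -> [('RECOC1', '10.0% free')]
```

FREE_MB is raw space before mirroring, so on NORMAL or HIGH redundancy disk groups the USABLE_FILE_MB column of the same view is likely a better basis for the threshold.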
9. Check ASM Disk Status
SELECT
    group_number,
    disk_number,
    name,
    path,
    header_status,
    mode_status,
    state
FROM v$asm_disk;
Expected:
HEADER_STATUS = MEMBER
MODE_STATUS = ONLINE
STATE = NORMAL
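With many disks, comparing every row against the expected values by hand is error-prone. A minimal Python sketch follows, assuming the rows are available as dictionaries keyed by lower-case column name; the function name and sample rows are hypothetical.

```python
# Expected values for a healthy disk that belongs to a mounted disk group.
EXPECTED = {"header_status": "MEMBER", "mode_status": "ONLINE", "state": "NORMAL"}


def unhealthy_asm_disks(rows):
    """Return (disk, {column: actual_value}) for rows that deviate from EXPECTED."""
    findings = []
    for row in rows:
        bad = {col: row.get(col) for col, want in EXPECTED.items() if row.get(col) != want}
        if bad:
            findings.append((row.get("name") or row.get("path"), bad))
    return findings


# Hypothetical rows from v$asm_disk.
rows = [
    {"name": "DATAC1_CD_00_CELL01", "header_status": "MEMBER",
     "mode_status": "ONLINE", "state": "NORMAL"},
    {"name": "DATAC1_CD_01_CELL01", "header_status": "MEMBER",
     "mode_status": "OFFLINE", "state": "NORMAL"},
]

print(unhealthy_asm_disks(rows))
# -> [('DATAC1_CD_01_CELL01', {'mode_status': 'OFFLINE'})]
```

Disks that are visible to ASM but not assigned to any disk group show other header statuses, so those rows may need to be excluded before applying this check.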
10. Verify Database Cluster Status
crsctl stat res -t
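Every resource whose TARGET is ONLINE should also report STATE:
ONLINE
Resources whose TARGET is OFFLINE are meant to be offline and need no action.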
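The tabular `-t` layout is convenient to read but awkward to parse. The sketch below instead assumes the output of `crsctl stat res` without `-t`, which prints one `NAME=`, `TYPE=`, `TARGET=`, `STATE=` block per resource separated by blank lines. It reports resources whose target is ONLINE but whose state is not. The function name and the sample text are illustrative.

```python
import re


def resources_not_at_target(crsctl_output):
    """Return (resource_name, state) where TARGET is ONLINE but STATE is not."""
    findings = []
    for block in re.split(r"\n\s*\n", crsctl_output.strip()):
        attrs = dict(line.split("=", 1) for line in block.splitlines() if "=" in line)
        # TARGET and STATE hold one comma-separated entry per server.
        targets = [t.strip() for t in attrs.get("TARGET", "").split(",")]
        states = [s.strip() for s in attrs.get("STATE", "").split(",")]
        for target, state in zip(targets, states):
            if target == "ONLINE" and not state.startswith("ONLINE"):
                findings.append((attrs.get("NAME"), state))
    return findings


# Hypothetical output; resource names are examples only.
sample = """NAME=ora.DATAC1.dg
TYPE=ora.diskgroup.type
TARGET=ONLINE , ONLINE
STATE=ONLINE on db01, OFFLINE

NAME=ora.proddb.db
TYPE=ora.database.type
TARGET=ONLINE , ONLINE
STATE=ONLINE on db01, ONLINE on db02

NAME=ora.example.disabled
TYPE=ora.example.type
TARGET=OFFLINE
STATE=OFFLINE
"""

print(resources_not_at_target(sample))
# -> [('ora.DATAC1.dg', 'OFFLINE')]
```

Comparing STATE against TARGET rather than against a fixed ONLINE keeps intentionally stopped resources out of the report.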
11. Check Database Instances
srvctl status database -d <database_name>
Example:
srvctl status database -d PRODDB
Expected:
Instance PROD1 is running on node db01
Instance PROD2 is running on node db02
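A script can pick out stopped instances from this output directly. The following Python sketch assumes the usual `Instance <name> is running on node <host>` and `Instance <name> is not running on node <host>` line formats; the function name and sample text are hypothetical.

```python
import re

# Group 1 captures the instance name, group 2 is set only when "not" is present.
STATUS_LINE = re.compile(r"^Instance (\S+) is (not )?running")


def stopped_instances(srvctl_output):
    """Return the names of instances reported as not running."""
    stopped = []
    for line in srvctl_output.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match and match.group(2):
            stopped.append(match.group(1))
    return stopped


# Hypothetical output of: srvctl status database -d PRODDB
sample = """Instance PROD1 is running on node db01
Instance PROD2 is not running on node db02"""

print(stopped_instances(sample))
# -> ['PROD2']
```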
12. Verify InfiniBand Status
Database Server:
ibstat
Verify:
State: Active
Physical state: LinkUp
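Each server normally has more than one InfiniBand port, so both values need to be checked per port. This is a minimal Python sketch that parses ibstat text output, assuming the common layout of a `CA '<name>'` header followed by indented `Port N:` sections with `key: value` lines. Function names and the sample text are illustrative.

```python
import re


def parse_ports(ibstat_output):
    """Collect one dict per port from ibstat text output."""
    ports, ca, current = [], None, None
    for raw in ibstat_output.splitlines():
        line = raw.strip()
        ca_match = re.match(r"CA '(.+)'$", line)
        if ca_match:
            ca, current = ca_match.group(1), None  # new adapter, no open port yet
            continue
        port_match = re.match(r"Port (\d+):$", line)
        if port_match:
            current = {"ca": ca, "port": int(port_match.group(1))}
            ports.append(current)
            continue
        if current is not None and ":" in line:
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    return ports


def links_down(ports):
    """Return (ca, port, state, physical_state) for ports that are not Active/LinkUp."""
    return [
        (p["ca"], p["port"], p.get("State"), p.get("Physical state"))
        for p in ports
        if p.get("State") != "Active" or p.get("Physical state") != "LinkUp"
    ]


# Hypothetical, shortened ibstat output.
sample = """CA 'mlx4_0'
    CA type: MT26428
    Number of ports: 2
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
    Port 2:
        State: Down
        Physical state: Polling
        Rate: 10
"""

print(links_down(parse_ports(sample)))
# -> [('mlx4_0', 2, 'Down', 'Polling')]
```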
13. Check InfiniBand Switch Connectivity
ibhosts
Ensure all hosts and storage cells are visible.
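14. Verify Exadata Software Image
Storage Server (from the operating system shell, not from CellCLI):
imageinfo
Verify the active image version and confirm that the active image status is success.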
15. Check Critical Alerts
list alerthistory
Recent alerts should be reviewed daily.
Filter Critical Alerts:
list alerthistory where severity='critical'
16. Review Hardware Faults
list physicaldisk where status != 'normal'
list celldisk where status != 'normal'
list griddisk where status != 'active'
Any output requires investigation.
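Because any output from these three commands is a finding, they lend themselves to a simple scripted check. The sketch below, in Python, runs each command through `cellcli -e` and keeps whatever non-blank lines come back. The function names are hypothetical, and the runner is passed in as a parameter so the logic can be exercised without a storage server.

```python
import subprocess

# The three filter commands from this section, keyed by object type.
FAULT_QUERIES = {
    "physicaldisk": "list physicaldisk where status != 'normal'",
    "celldisk": "list celldisk where status != 'normal'",
    "griddisk": "list griddisk where status != 'active'",
}


def run_cellcli(command):
    """Run one CellCLI command on the local cell and return its stdout."""
    result = subprocess.run(
        ["cellcli", "-e", command], capture_output=True, text=True, check=True
    )
    return result.stdout


def collect_faults(run=run_cellcli):
    """Return {object_type: [output lines]} for every query that printed something."""
    faults = {}
    for label, command in FAULT_QUERIES.items():
        lines = [line.strip() for line in run(command).splitlines() if line.strip()]
        if lines:
            faults[label] = lines
    return faults


# Demonstration with a stand-in runner instead of a real cell.
def fake_run(command):
    return "\t 20:1 \t critical\n" if "physicaldisk" in command else ""


print(collect_faults(fake_run))
# -> {'physicaldisk': ['20:1 \t critical']}
```

On a real storage server, calling `collect_faults()` with no argument would use `run_cellcli`; an empty dictionary then means all three checks came back clean.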
17. Verify Cell Services
cellcli -e "list cell detail"
Ensure:
- MS (Management Server): msStatus = running
- RS (Restart Server): rsStatus = running
- CELLSRV (Cell Server): cellsrvStatus = running
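The three service states can be pulled out of the detail output with a small parser. This Python sketch assumes the output consists of `attribute: value` lines and that the relevant attributes are named `status`, `cellsrvStatus`, `msStatus`, and `rsStatus`; the function name and sample text are illustrative.

```python
# Attribute names and healthy values assumed for `list cell detail` output.
WANTED = {
    "status": "online",
    "cellsrvStatus": "running",
    "msStatus": "running",
    "rsStatus": "running",
}


def cell_service_problems(detail_output):
    """Return {attribute: actual_value} for attributes that are not healthy."""
    attrs = {}
    for line in detail_output.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            attrs[key.strip()] = value.strip()
    return {
        key: attrs.get(key, "missing")
        for key, want in WANTED.items()
        if attrs.get(key) != want
    }


# Hypothetical, shortened output of: cellcli -e "list cell detail"
sample = """
         name:                   cell01
         status:                 online
         cellsrvStatus:          running
         msStatus:               stopped
         rsStatus:               running
"""

print(cell_service_problems(sample))
# -> {'msStatus': 'stopped'}
```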
18. Exachk Health Assessment
Run exachk, either from the PATH or from its installation directory:
exachk
or
./exachk
Review the FAIL, WARNING, and INFO sections of the generated report carefully.
Daily DBA Health Check Checklist
| Check | Status |
| CRS Resources Online | ✓ |
| ASM Disk Groups Mounted | ✓ |
| Cell Status Online | ✓ |
| Grid Disks Active | ✓ |
| Cell Disks Normal | ✓ |
| Flash Cache Healthy | ✓ |
| InfiniBand Active | ✓ |
| No Critical Alerts | ✓ |
| Database Instances Running | ✓ |
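If the individual checks are scripted, their results can be rolled up into the same checklist. The sketch below is one possible shape for that in Python, assuming each check has already produced a boolean result. The function name is hypothetical, and a check with no recorded result is shown as failed rather than passed.

```python
# Checklist items in the order used in the table above.
CHECKS = [
    "CRS Resources Online",
    "ASM Disk Groups Mounted",
    "Cell Status Online",
    "Grid Disks Active",
    "Cell Disks Normal",
    "Flash Cache Healthy",
    "InfiniBand Active",
    "No Critical Alerts",
    "Database Instances Running",
]


def render_checklist(results):
    """Render one '| check | mark |' row per item; missing results count as failed."""
    rows = []
    for check in CHECKS:
        mark = "✓" if results.get(check) is True else "✗"
        rows.append(f"| {check} | {mark} |")
    return "\n".join(rows)


# Hypothetical results: everything passed except the alert check.
results = {check: True for check in CHECKS}
results["No Critical Alerts"] = False

print(render_checklist(results))
```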
Common Issues
Cell Disk Offline
list celldisk
Investigate hardware and storage alerts immediately.
ASM Disk Missing
select name,state,path from v$asm_disk;
Check storage server status and ASM alerts.
InfiniBand Link Down
ibstat
Expected:
State: Active
Any other status should be investigated.
Conclusion
Regular Exadata health checks help identify storage, ASM, network, and cluster issues before they affect production workloads. Incorporating these commands into daily operational checks significantly improves platform stability and reduces unplanned outages.
A disciplined health check process is one of the most important responsibilities of an Exadata DBA.