IBM 150 User Manual

RS/6000 43P 7043 Models 150 and 260 Handbook
3.1.2  Reliability, Availability, and Serviceability (RAS)
The following features give the IBM RS/6000 Model 150 its reliability, 
availability, and serviceability.
3.1.2.1  Reliability, Fault Tolerance, and Data Integrity
The reliability of the Model 150 system starts with reliable components, 
devices, and subsystems. During the design and the development process, 
subsystems go through rigorous verification and integration testing 
processes. During system manufacturing, systems go through a testing 
process to ensure the highest product quality level.
The Model 150 system memory offers ECC (Error Checking and Correcting) 
fault-tolerant features. ECC corrects environment-induced, intermittent 
single-bit memory failures as well as single-bit hard failures. With ECC, the 
majority of memory failures will not impact system operation. ECC also 
provides double-bit memory error detection, which protects data integrity in 
the event of a double-bit memory failure. The system bus and PCI buses 
are designed with parity error detection. 
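The single-bit correction and double-bit detection described above can be 
illustrated with a small SECDED (single error correct, double error detect) 
code. The sketch below is a toy example only: it protects a 4-bit value with 
an extended Hamming(8,4) code, and the function names encode and decode are 
hypothetical. The word width and the actual ECC code used by the Model 150 
memory subsystem are not stated in this section and are not implied here.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit SECDED codeword (list of bits).

    Index 0 holds the overall parity bit; indices 1, 2 and 4 hold the
    Hamming parity bits; indices 3, 5, 6 and 7 hold the data bits.
    """
    code = [0] * 8
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    # Parity bit p covers every position whose index has bit p set.
    for p in (1, 2, 4):
        code[p] = sum(code[i] for i in range(1, 8) if i & p and i != p) % 2
    # The overall parity bit makes the total number of 1 bits even.
    code[0] = sum(code[1:]) % 2
    return code


def decode(code):
    """Return (nibble, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    # The syndrome is the XOR of the indices of all set bits; it is 0 for a
    # valid codeword and equals the flipped position after a single-bit error.
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i
    overall_even = sum(code) % 2 == 0

    if syndrome == 0 and overall_even:
        status = "ok"
    elif not overall_even:
        # Odd overall parity: a single bit flipped. Syndrome 0 means the
        # overall parity bit itself (index 0) was the one that flipped.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        # The error is detected but cannot be corrected.
        return None, "uncorrectable"

    nibble = code[3] | code[5] << 1 | code[6] << 2 | code[7] << 3
    return nibble, status


if __name__ == "__main__":
    word = encode(0b1011)
    word[6] ^= 1                  # simulate a single-bit memory failure
    print(decode(word))           # value is recovered, status 'corrected'
    word[2] ^= 1                  # a second failure in the same word
    print(decode(word))           # detected, reported as 'uncorrectable'
```

The overall parity bit at index 0 works on the same principle as the parity 
checking mentioned for the system and PCI buses: it can reveal that a bit 
changed, but on its own it cannot say which one.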
Disk mirroring and disk controller duplexing capability are provided by the AIX 
operating system. 
The journaled file system (JFS) of the AIX operating system maintains file 
system consistency and prevents data loss when the system is abnormally 
halted due to a power failure.
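To show why a journal keeps a file system consistent across an abnormal halt, 
the following is a minimal sketch of generic write-ahead logging. It is not 
the JFS on-disk format or its recovery code; the class JournaledStore, its 
methods, and the crash flags are hypothetical names introduced only for this 
illustration. Changes are written to the journal first, marked committed, and 
only then applied, so recovery can replay committed changes and ignore 
incomplete ones.

```python
class JournaledStore:
    """Toy model: 'journal' and 'data' both stand in for on-disk state."""

    def __init__(self):
        self.journal = []   # append-only log that survives a crash
        self.data = {}      # the structures the journal protects

    def transaction(self, txid, updates,
                    crash_before_commit=False, crash_before_apply=False):
        # 1. Record the intended changes in the journal.
        for key, value in updates.items():
            self.journal.append(("update", txid, key, value))
        if crash_before_commit:
            return          # simulated power failure: no commit record
        # 2. Mark the transaction as complete.
        self.journal.append(("commit", txid))
        if crash_before_apply:
            return          # simulated power failure: data not yet updated
        # 3. Apply the changes in place.
        self.data.update(updates)

    def recover(self):
        """Replay committed transactions; uncommitted ones are ignored."""
        committed = {rec[1] for rec in self.journal if rec[0] == "commit"}
        for rec in self.journal:
            if rec[0] == "update" and rec[1] in committed:
                _, _, key, value = rec
                self.data[key] = value   # replay is safe to repeat


if __name__ == "__main__":
    store = JournaledStore()
    store.transaction(1, {"dir/a": "inode 7"})
    # The halt occurs after the commit record but before the data is updated.
    store.transaction(2, {"dir/a": None, "dir/b": "inode 7"},
                      crash_before_apply=True)
    # The halt occurs before the commit record is written.
    store.transaction(3, {"dir/c": "inode 9"}, crash_before_commit=True)
    store.recover()
    print(store.data)
```

After recovery, transaction 2 is applied as a whole and transaction 3 leaves 
no trace, so the store never shows a half-completed change.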
An available RAID hardware feature external to the system provides data 
integrity and fault tolerance in the event of a disk failure.
3.1.2.2  Fault Monitoring Functions
Following are the functions used to monitor faults during the boot process.
  • Power-On Self-Test (POST) checks the processor, L2 cache, memory, 
and associated hardware required for proper booting of the operating 
system every time the system is powered on. If a non-critical error is 
detected, or if errors occur in resources that can be removed from the 
system configuration, the boot process proceeds to completion. The 
errors are logged in the system non-volatile RAM (NVRAM).
  • Disk drive fault tracking is a facility that can alert the system administrator 
to an impending disk failure before it impacts customer operation.
  • The AIX log facility records hardware and software failures and analyzes 
them (through the Error Log Analysis routine) to warn the system 
administrator about the causes of system problems. This also