IBM 520Q User Manual

IBM System p5 520 and 520Q Technical Overview and Introduction
If the output shows CPU Guard as disabled, enter the following command to enable it:
chdev -l sys0 -a cpuguard='enable'
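The check-then-enable step can be sketched as a small script. This is a minimal sketch, not an IBM-supplied procedure: the function names `cpuguard_state` and `ensure_cpuguard` are hypothetical, and the parser assumes that `lsattr -El sys0 -a cpuguard` prints a line of the form `cpuguard disable CPU Guard True`, with the current value in the second field.

```shell
# Hypothetical helper: read `lsattr -El sys0 -a cpuguard` output on stdin
# and print the current value (assumed to be the second field).
cpuguard_state() {
    awk '$1 == "cpuguard" { print $2 }'
}

# Hypothetical helper: enable CPU Guard only when it is reported as disabled.
# The AIX commands run only when uname reports AIX, so the script does
# nothing harmful on other systems.
ensure_cpuguard() {
    if [ "$(uname -s)" != "AIX" ]; then
        echo "not an AIX system; skipping" >&2
        return 1
    fi
    state=$(lsattr -El sys0 -a cpuguard | cpuguard_state)
    if [ "$state" = "disable" ]; then
        chdev -l sys0 -a cpuguard=enable
    fi
}
```

Calling `ensure_cpuguard` as root on the AIX partition would leave an already enabled setting untouched, which makes the script safe to rerun.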
Cache or cache-line deallocation uses dynamic reconfiguration to bypass potentially failing 
components. This capability is provided for both L2 and L3 caches. Dynamic 
run-time deconfiguration is provided if a threshold of L1 or L2 recovered errors is exceeded.
In the case of an L3 cache run-time array single-bit solid error, the spare resources are used 
to perform a line delete on the failing line. 
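The threshold idea behind run-time deconfiguration can be illustrated with a toy model. The real logic lives in hardware and firmware and is not exposed as a script; the function name `lines_over_threshold`, the input format (one cache-line identifier per recovered error), and the threshold value are all assumptions made for illustration.

```shell
# Illustrative model only: read one cache-line ID per recovered error on
# stdin and print each ID once, at the moment its error count first
# exceeds the given threshold.
lines_over_threshold() {
    threshold=$1
    awk -v t="$threshold" '
        NF == 0 { next }                      # ignore blank input lines
        { count[$1]++ }                       # tally recovered errors per line ID
        count[$1] == t + 1 { print $1 }       # first count above the threshold
    '
}
```

For example, `printf '%s\n' a b a a c a | lines_over_threshold 2` reports only line `a`, because it is the only identifier with more than two recovered errors.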
PCI hot-plug slot fault tracking helps prevent slot errors from causing a system machine 
check interrupt and subsequent reboot. This provides superior fault isolation, and the error 
affects only the single adapter. Run-time errors on the PCI bus caused by failing adapters 
result in recovery action. If this is unsuccessful, the PCI device is shut down gracefully. Parity 
errors on the PCI bus itself result in bus retry, and if uncorrected, the bus and any I/O 
adapters or devices on that bus are deconfigured.
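The escalation described above can be summarized as a small decision function. This is only a restatement of the paragraph in code form, not firmware behavior that can be invoked; the function name `pci_error_outcome` and its arguments are hypothetical.

```shell
# Illustrative only: map the error origin ("adapter" or "bus") and whether
# the first response succeeded ("yes" or "no") to the outcome described
# in the text.
pci_error_outcome() {
    origin=$1
    recovered=$2
    case "$origin:$recovered" in
        adapter:yes) echo "recovery action succeeded; adapter stays in service" ;;
        adapter:no)  echo "PCI device shut down gracefully" ;;
        bus:yes)     echo "bus retry succeeded; bus stays in service" ;;
        bus:no)      echo "bus and attached I/O adapters or devices deconfigured" ;;
        *)           echo "unknown input: $origin $recovered" >&2; return 1 ;;
    esac
}
```

In every branch the effect is confined to one adapter or one bus, which is the fault-isolation property the paragraph emphasizes.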
The p5-520 or p5-520Q supports PCI Extended Error Handling (EEH), if it is supported by the 
PCI-X adapter. In the past, PCI bus parity errors caused a global machine check interrupt, 
which eventually required a system reboot in order to continue. In the p5-520 or p5-520Q 
system, hardware, system firmware, and AIX 5L interaction have been designed to allow 
transparent recovery of intermittent PCI bus parity errors and graceful transition to the I/O 
device available state in the case of a permanent parity error in the PCI bus.
EEH-enabled adapters respond to a special data packet generated from the affected PCI slot 
hardware by calling system firmware, which examines the affected bus, allows the device 
driver to reset it, and continues without a system reboot.
Persistent deallocation functions include:
• Processor
• Memory
• Deconfigure or bypass failing I/O adapters
• L3 cache
Following a hardware error that has been flagged by the service processor, the subsequent 
reboot of the system invokes extended diagnostics. If a processor or L3 cache is marked for 
deconfiguration by persistent processor deallocation, the boot process attempts to proceed to 
completion with the faulty device deconfigured automatically. Failing I/O adapters are 
deconfigured or bypassed during the boot process.
Note: The auto-restart (reboot) option, when enabled, can reboot the system automatically 
following an unrecoverable software error, software hang, hardware failure, or 
environmentally induced failure (such as a loss of the power supply).
3.1.8  Serviceability
Increasing service productivity means the system is up and running for a longer time. The 
p5-520 and p5-520Q improve service productivity by providing the functions described in the 
following sections.
Error indication and LED indicators
The p5-520 and p5-520Q are designed for client setup of the machine and for the subsequent 
addition of most hardware features. The p5-520 and p5-520Q also allow clients to replace 
service parts (Client Replaceable Units). To accomplish this, the p5-520 or p5-520Q provides 