Intel E7-8891 v2 CM8063601377422 User Manual

Processor E7-2800/4800/8800 v2 Product Family
Datasheet Volume Two: Functional Description, February 2014
Reliability, Availability, Serviceability, and Manageability
Memory Migration
The Intel Xeon processor E7 v2 product family will provide support for migration of 
memory to a spare FRU. Only one migration target in the system will be supported at a 
time, which means that there will only be one master home and one slave home in the 
IIO RAS Overview
The IIO module RAS features aim to achieve the following:
• Error Containment
• PCI Express soft, uncorrectable error detection and recovery on links
IIO Module Error Reporting
The IIO module logs detected errors and reports them by generating “system events”. In 
the context of error reporting, a system event is an event that notifies the system of 
the error. Two types of system events can be generated: an inband message to the 
CPU, out-of-band signaling to the platform, or both. In the case of inband messaging, 
the CPU is notified of the error by the inband message (interrupt, failed response, and 
so forth). Out-of-band signaling (Error Pins) informs an external agent of the error 
events. An external agent such as a BMC may collect the errors from the error pins to 
determine the health of the system and send interrupts to the CPU accordingly.
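The two reporting paths described above can be sketched as a small model. This is an illustration only, not the IIO hardware interface: the names SystemEvent and report_error and the two enable flags are hypothetical, and the text states only that either or both event types can be generated.

```python
from enum import Flag


class SystemEvent(Flag):
    """Hypothetical model of the two system event types named in the text."""
    NONE = 0
    INBAND = 1       # message to the CPU (interrupt, failed response, and so forth)
    OUT_OF_BAND = 2  # Error Pin signaling to an external agent such as a BMC


def report_error(inband_enabled: bool, error_pin_enabled: bool) -> SystemEvent:
    """Return the set of system events generated for one detected error.

    The enable flags are assumptions made for this sketch; how routing is
    actually configured is not described in this section.
    """
    events = SystemEvent.NONE
    if inband_enabled:
        events |= SystemEvent.INBAND
    if error_pin_enabled:
        events |= SystemEvent.OUT_OF_BAND
    return events
```

Modeling the events as combinable flags reflects the "and/or" relationship in the text: a single error may notify the CPU, the platform, or both.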
Error Severity Classification
In the IIO module, errors are classified into three severities: Correctable, Recoverable, 
and Fatal. This classification separates those errors resulting in functional failures from 
those errors resulting in degraded performance or errors resulting in system resets.
Correctable Errors (Severity 0 Error)
Hardware correctable errors include those error conditions where the system can 
recover without any loss of information. Hardware corrects these errors and no 
software intervention is required.
Recoverable Errors (Severity 1 Error)
Recoverable errors are software correctable or software/hardware uncorrectable errors 
which cause a particular transaction to be unreliable but the system hardware is 
otherwise fully functional. Isolating recoverable from fatal errors provides system 
management software the opportunity to recover from the error without reset and 
disturbing other transactions in progress. Devices not associated with the transaction in 
error are not impacted by the error.
Software Correctable Errors
Software correctable errors are considered “recoverable” errors. These errors include 
those error conditions where the system can recover without any loss of information. 
Software intervention is required to correct these errors.
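The severity classes described in this section can be summarized in a short sketch. The names Severity, hardware_corrects, and can_recover_without_reset are hypothetical. The values 0 and 1 come from the headings above; the value 2 for fatal errors, and the rule that fatal errors cannot be recovered without a reset, are assumptions not stated in this excerpt.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical encoding of the IIO error severities."""
    CORRECTABLE = 0  # hardware corrects the error; no software intervention
    RECOVERABLE = 1  # one transaction is unreliable; hardware otherwise functional
    FATAL = 2        # assumed value; not given in this excerpt


def hardware_corrects(severity: Severity) -> bool:
    """True when hardware alone corrects the error with no loss of information."""
    return severity == Severity.CORRECTABLE


def can_recover_without_reset(severity: Severity) -> bool:
    """True when recovery without a reset is possible.

    Assumption: only fatal errors force a reset; the text says only that
    isolating recoverable from fatal errors gives system management
    software the opportunity to recover without one.
    """
    return severity < Severity.FATAL
```

For example, a software correctable error falls under Severity.RECOVERABLE: hardware_corrects returns False because software intervention is required, while can_recover_without_reset returns True.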