Intel E7-8891 v2 CM8063601377422 User Manual

Processor E7-2800/4800/8800 v2 Product Family
Datasheet Volume Two: Functional Description, February 2014
Reliability, Availability, Serviceability, and Manageability
This chapter describes RASM (Reliability, Availability, Serviceability and Manageability) 
features of the Intel Xeon processor E7 v2 product family. RASM refers to feature sets 
that are associated with system robustness and is defined as follows:
Reliability: System capability to detect errors, report errors, and correct errors. 
Reliability is typically measured in FITs (Failures in Time), where 1 FIT = 1 error in 
1 billion hours. Alternatively, it can be measured as MTBF (Mean Time Between Failures). 
1 FIT is equivalent to an MTBF of approximately 114,000 years.
Availability: System capability to redistribute (or reallocate) resources and maintain 
normal operation if an error event occurs. Availability is typically measured as uptime 
or downtime over a given time interval, for example, 99.999% availability (also known 
as “five 9’s”). A “five 9’s” system would assure an average downtime of about 5 
minutes over one year of continuous operation.
Serviceability: System capability to effectively report a failure and to expedite the 
servicing efforts. Incorporating serviceability features typically results in more efficient 
product maintenance, reduces operational costs, and minimizes the ‘downtime’ of a system.
Manageability: System capability to monitor the health of the individual components, 
to predict failures, and to manage various resources. Incorporating manageability 
features typically results in maximizing the ‘uptime’ of a system.
The primary objective of this chapter is to cover the Reliability features of the processor, 
since Availability, Serviceability, and Manageability features are generally applicable at 
the system level. Any feature that also addresses Availability, Serviceability, or 
Manageability is identified where applicable.
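The reliability and availability metrics defined above reduce to simple arithmetic. The following is a minimal sketch of both conversions; the function names and the 8,760-hour year (leap years ignored) are assumptions made for illustration and do not come from the datasheet.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; assumption: leap years ignored


def fit_to_mtbf_years(fit: float) -> float:
    """Convert a FIT rate (errors per 1e9 hours) to MTBF in years."""
    if fit <= 0:
        raise ValueError("FIT rate must be positive")
    mtbf_hours = 1e9 / fit
    return mtbf_hours / HOURS_PER_YEAR


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Average yearly downtime implied by an availability percentage."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * HOURS_PER_YEAR * 60


print(round(fit_to_mtbf_years(1)))                  # 114155 (about 114K years)
print(round(downtime_minutes_per_year(99.999), 2))  # 5.26 (minutes per year)
```

Under these assumptions, 1 FIT corresponds to an MTBF of roughly 114,000 years, and 99.999% availability corresponds to a little over 5 minutes of downtime per year.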
RASM Overview
Refer to 
 for a high level 
view of the processor’s key modules. The Intel Xeon processor E7 v2 product family 
incorporates several features to address system reliability, availability, serviceability, 
and manageability requirements. To meet the target reliability requirements 
and to minimize the impact of errors, the processor incorporates various techniques 
such as parity, ECC, CRC, and redundancy within individual modules. Most errors 
are corrected by the built-in error correction logic and are called “Corrected 
Errors”. The Intel Xeon processor E7 v2 product family also incorporates Corrupt Data 
Containment and MCA Recovery features to minimize the impact of certain errors 
known as “Uncorrected Recoverable” (UCR) errors. This section first describes the 
sources of errors, then classifies the errors based on the processor’s error handling 
capability, and finally briefly documents all the available RASM features. Subsequent 
sections describe these RASM features in more detail.
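As a generic illustration of two of the detection techniques named above, the sketch below compares a single even-parity bit with a CRC on a small payload. It is not the processor's actual implementation: the helper names, the payload, and the use of CRC-32 from Python's `zlib` module are all assumptions chosen for illustration. ECC, which can correct as well as detect errors, is not shown.

```python
import zlib


def even_parity_bit(data: bytes) -> int:
    """Return the parity bit that makes the total count of 1 bits even."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones % 2


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted (simulated error)."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


payload = b"cache line"            # hypothetical protected data
stored_parity = even_parity_bit(payload)
stored_crc = zlib.crc32(payload)

one_flip = flip_bit(payload, 3)    # single-bit error
two_flips = flip_bit(one_flip, 12) # second error in the same payload

print(even_parity_bit(one_flip) != stored_parity)   # True: parity detects one flip
print(even_parity_bit(two_flips) != stored_parity)  # False: parity misses two flips
print(zlib.crc32(two_flips) != stored_crc)          # True: CRC detects both flips
```

A single parity bit detects any odd number of flipped bits but misses an even number, which is one reason stronger codes such as CRC and ECC are used where multi-bit errors matter.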