User Guide for Intel SE7520JR2

Functional Architecture 
Intel® Server Board SE7520JR2 
 
 
Revision 1.0 
C78844-002 
Uncorrectable memory errors are critical errors that may cause the system to fail. The BIOS 
normally detects and logs these errors as IPMI SEL events for all management levels, except in 
the case described below. 
It is possible that a critical hardware error (uncorrectable memory or bus error) may prevent the 
BIOS from running, reporting the error, and restarting the system. In Professional and Advanced 
management models, the Sahalee BMC monitors the SMI signal, which, if it stays asserted for a 
long period of time, is an indication that BIOS cannot run. In this case, the Sahalee BMC logs 
an SMI Timeout event and probes for errors. If it finds an uncorrectable memory error, it logs 
data against the IPMI type 0Ch Memory sensor; if it finds a bus error, it logs against the IPMI 
type 13h Critical Interrupt sensor. Both events can include additional data in bytes 2 and 3, 
depending on the exact nature of the error and what the chipset reports to the Sahalee BMC. 
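The decision flow described above can be sketched as follows. This is a minimal illustration only, not the Sahalee BMC firmware: the names `ProbeResult`, `smi_timed_out`, and `events_to_log` are hypothetical, and the timeout threshold is a caller-supplied assumption because the text does not state how long the SMI signal must stay asserted. Only the sensor type codes 0Ch and 13h come from the text.

```python
from enum import Enum

# IPMI sensor type codes named in the text.
SENSOR_TYPE_MEMORY = 0x0C
SENSOR_TYPE_CRITICAL_INTERRUPT = 0x13


class ProbeResult(Enum):
    """Hypothetical outcome of the BMC's error probe."""
    NOTHING_FOUND = 0
    UNCORRECTABLE_MEMORY = 1
    BUS_ERROR = 2


def smi_timed_out(asserted_seconds, threshold_seconds):
    """True if SMI has stayed asserted for at least the (assumed) threshold."""
    return asserted_seconds >= threshold_seconds


def events_to_log(asserted_seconds, threshold_seconds, probe_result):
    """Return (label, sensor_type) pairs the sketch would log.

    sensor_type is None for the SMI Timeout entry because the text
    does not say which sensor type that event is logged against.
    """
    if not smi_timed_out(asserted_seconds, threshold_seconds):
        return []
    events = [("SMI Timeout", None)]
    if probe_result is ProbeResult.UNCORRECTABLE_MEMORY:
        events.append(("Uncorrectable memory error", SENSOR_TYPE_MEMORY))
    elif probe_result is ProbeResult.BUS_ERROR:
        events.append(("Bus error", SENSOR_TYPE_CRITICAL_INTERRUPT))
    return events
```

For example, `events_to_log(30, 10, ProbeResult.BUS_ERROR)` yields the SMI Timeout entry followed by a bus error entry tagged with sensor type 13h; the values 30 and 10 are arbitrary.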
3.3.6 Memory RASUM Features 
The Intel E7520 MCH supports several memory RASUM (Reliability, Availability, Serviceability, 
Usability, and Manageability) features.  These features include the Intel® x4 Single Device Data 
Correction (x4 SDDC) for memory error detection and correction, Memory Scrubbing, Retry on 
Correctable Errors, Integrated Memory Initialization, DIMM Sparing, and Memory Mirroring.  The 
following sections describe how each is supported. 
Note: The operation of the memory RASUM features listed below is supported regardless of the 
platform management model used. However, with no Intel® Management Module installed, the 
system has limited memory monitoring and logging capabilities. When standard Onboard 
Platform Instrumentation is used, a RASUM feature may be initiated without any notification 
that the action has occurred. 
3.3.6.1 DRAM ECC – Intel® x4 Single Device Data Correction (x4 SDDC) 
The DRAM interface uses two different ECC algorithms. The first is a standard SEC/DED ECC 
across a 64-bit data quantity. The second ECC method is a distributed, 144-bit S4EC-D4ED 
mechanism, which provides x4 SDDC protection for DIMMs that use x4 devices. Bits from x4 
parts are presented in an interleaved fashion such that each bit from a particular part is 
represented in a different ECC word. DIMMs that use x8 devices can use the same algorithm 
but will not have x4 SDDC protection, since this method can correct at most four bits. The 
algorithm does provide enhanced protection for x8 parts over a standard SEC/DED 
implementation. With two memory channels, either ECC method can be used with equal 
performance, although single-channel mode supports only standard SEC/DED. 
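To make the SEC/DED idea concrete (single-error correction, double-error detection), the following toy sketch uses an extended Hamming code over 4 data bits. It is an illustration under stated assumptions, not the MCH's algorithm: the hardware operates on a 64-bit data quantity, and the S4EC-D4ED code is a different, symbol-based code. The function names `encode` and `decode` are hypothetical.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword.

    Index 0 holds the overall parity bit; indexes 1..7 are the
    classic Hamming(7,4) positions (parity at 1, 2, 4).
    """
    code = [0] * 8
    code[3] = (nibble >> 0) & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    for pos in range(1, 8):
        code[0] ^= code[pos]
    return code


def decode(received):
    """Return (data, status); status is 'ok', 'corrected', or 'uncorrectable'."""
    code = list(received)  # work on a copy, leave the input untouched
    syndrome = 0
    overall = 0
    for pos in range(8):
        overall ^= code[pos]
        if code[pos]:
            syndrome ^= pos  # position 0 contributes nothing to the syndrome
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flipped bits: treat as a single error and fix it.
        # syndrome == 0 here means the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return None, "uncorrectable"
    data = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return data, status
```

Flipping any one bit of `encode(0b1011)` and passing the result to `decode` recovers the original value with status `corrected`; flipping any two bits is reported as `uncorrectable` rather than silently miscorrected.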
When memory mirroring is enabled, x4 SDDC ECC is supported in single-channel mode when 
the second channel has been disabled during a fail-down phase. x4 SDDC ECC is not 
supported during single-channel operation outside of memory mirroring fail-down because it 
has a significant performance impact in that environment. 
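The channel rules in this section reduce to a small truth table, sketched below. The function name `x4_sddc_supported` and its parameters are hypothetical; the sketch encodes only the conditions stated in the text and says nothing about whether the installed DIMMs use x4 or x8 devices.

```python
def x4_sddc_supported(active_channels, mirroring_fail_down=False):
    """Return True if the x4 SDDC ECC method is supported.

    active_channels: 1 or 2 memory channels currently in use.
    mirroring_fail_down: True when memory mirroring is enabled and the
        second channel has been disabled during a fail-down phase.
    """
    if active_channels not in (1, 2):
        raise ValueError("active_channels must be 1 or 2")
    if active_channels == 2:
        # With two channels, either ECC method can be used.
        return True
    # Single channel: x4 SDDC only during mirroring fail-down.
    return mirroring_fail_down
```

Under these assumptions, `x4_sddc_supported(1)` is `False`, while `x4_sddc_supported(1, mirroring_fail_down=True)` and `x4_sddc_supported(2)` are both `True`.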
3.3.6.2 Integrated Memory Scrub Engine 
The Intel E7520 MCH includes an integrated engine that walks the populated memory space, 
proactively seeking out soft errors in the memory subsystem. In the case of a single-bit 
correctable error, this hardware detects, logs, and corrects the data except when an incoming 
write to the same memory address is detected. For any uncorrectable errors detected, the scrub