User Guide for Intel SE7520JR2

Intel® Server Board SE7520JR2 
Error Reporting and Handling 
Revision 1.0 
 
 
C78844-002 
6.3 Error Logging
This section defines how errors are handled by the system BIOS. Also discussed is the role of 
the BIOS in error handling and the interaction between the BIOS, platform hardware, and server 
management firmware with regard to error handling. In addition, error-logging techniques are 
described and beep codes for errors are defined.  
One of the major requirements of server management is to correctly and consistently handle 
system errors. System error sources can be categorized as follows:  
• PCI bus
• Memory multi-bit errors (single-bit errors are not logged)
• Sensors
• Processor internal errors, bus/address errors, thermal trip errors, temperatures and voltages, and GTL voltage levels
• Errors detected during POST, logged as POST errors
Sensors are managed by the mBMC. The mBMC can receive event messages from individual
sensors and log system events.
6.3.1 SMI Handler
The SMI handler handles and logs system-level events that are not visible to the server 
management firmware. If SEL error logging is disabled in the BIOS Setup utility, no SMI signals 
are generated on system errors. If error logging is enabled, the SMI handler preprocesses all 
system errors, even those that are normally considered to generate an NMI.  
The SMI handler sends a command to the BMC to log the event and provides the data to be
logged. For example, the BIOS programs the hardware to generate an SMI on a single-bit
memory error and logs the location of the failed DIMM in the system event log.
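The logging step described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the board's BIOS code: the function name smi_log_memory_error, the sensor number, the generator ID value, and the choice to carry the DIMM index in Event Data 3 are all hypothetical. The 16-byte record layout, the record type 02h, and the memory sensor type 0Ch follow the generic IPMI system event record format; the actual mBMC interface on this board may differ.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SEL_RECORD_LEN               16
#define SEL_TYPE_SYSTEM_EVENT        0x02 /* IPMI system event record */
#define SEL_EVM_REV                  0x04 /* event message format revision */
#define SENSOR_TYPE_MEMORY           0x0C /* IPMI sensor type: Memory */
#define EVENT_TYPE_SENSOR_SPECIFIC   0x6F /* bit 7 clear = assertion event */
#define MEM_OFFSET_CORRECTABLE_ECC   0x00
#define MEM_OFFSET_UNCORRECTABLE_ECC 0x01

/* Assumptions for illustration only; not taken from the board documentation. */
#define HYPOTHETICAL_GENERATOR_ID    0x21 /* software ID 10h, bit 0 set */
#define HYPOTHETICAL_MEM_SENSOR_NUM  0x08

/*
 * Builds the event record that an SMI handler could hand to the BMC.
 * Returns false when SEL logging is disabled: in that case no SMI is
 * generated at all, so nothing is logged and rec is left untouched.
 */
bool smi_log_memory_error(bool sel_logging_enabled, bool uncorrectable,
                          uint8_t dimm_index, uint8_t rec[SEL_RECORD_LEN])
{
    if (!sel_logging_enabled)
        return false;

    /* Record ID (bytes 0-1) and timestamp (bytes 3-6) stay zero;
       the BMC assigns them when it stores the entry. */
    memset(rec, 0, SEL_RECORD_LEN);
    rec[2]  = SEL_TYPE_SYSTEM_EVENT;
    rec[7]  = HYPOTHETICAL_GENERATOR_ID;
    rec[9]  = SEL_EVM_REV;
    rec[10] = SENSOR_TYPE_MEMORY;
    rec[11] = HYPOTHETICAL_MEM_SENSOR_NUM;
    rec[12] = EVENT_TYPE_SENSOR_SPECIFIC;
    /* Event Data 1: bits 5:4 = 11b mark Event Data 3 as valid,
       bits 3:0 hold the sensor-specific offset. */
    rec[13] = (uint8_t)(0x30 | (uncorrectable ? MEM_OFFSET_UNCORRECTABLE_ECC
                                              : MEM_OFFSET_CORRECTABLE_ECC));
    rec[14] = 0xFF;        /* Event Data 2: unspecified */
    rec[15] = dimm_index;  /* Event Data 3: failing DIMM (assumed encoding) */
    return true;
}
```

A caller would pass the Setup state and the DIMM reported by the memory controller, then forward the filled record to the BMC over whatever transport the platform provides. Keeping the record builder separate from the transport makes the encoding easy to check in isolation.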
6.3.2 PCI Bus Error
The PCI bus defines two error pins, PERR# and SERR#, for reporting PCI parity errors and 
system errors, respectively. The BIOS can be instructed to enable or disable reporting of
PERR# and SERR# through NMI. Disabling NMI for PERR# and/or SERR# also disables
logging of the corresponding event.  In the case of PERR#, the PCI bus master has the option 
to retry the offending transaction, or to report it using SERR#.  All other PCI-related errors are 
reported by SERR#.  All the PCI-to-PCI bridges are configured so that they generate a SERR# 
on the primary interface whenever there is a SERR# on the secondary side, if SERR# has been 
enabled through Setup.  The same is true for PERR#.   
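The bridge configuration described above can be illustrated with a short C sketch. It assumes the standard PCI configuration header bit positions (Command register bits 6 and 8, Bridge Control register bits 0 and 1); the struct and function names are hypothetical, and the actual BIOS may program additional chipset-specific registers that are not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Command register (configuration offset 04h) */
#define PCI_COMMAND_PARITY     (1u << 6) /* Parity Error Response */
#define PCI_COMMAND_SERR       (1u << 8) /* SERR# Enable */

/* Bridge Control register (configuration offset 3Eh) */
#define PCI_BRIDGE_CTL_PARITY  (1u << 0) /* Parity Error Response, secondary side */
#define PCI_BRIDGE_CTL_SERR    (1u << 1) /* forward secondary SERR# to primary */

/* Hypothetical snapshot of the two registers of one PCI-to-PCI bridge. */
struct bridge_error_regs {
    uint16_t command;
    uint16_t bridge_control;
};

/* Sets or clears the bits in mask, leaving all other bits unchanged. */
static uint16_t set_bits(uint16_t value, uint16_t mask, bool on)
{
    return on ? (uint16_t)(value | mask) : (uint16_t)(value & ~mask);
}

/*
 * Returns the register values a bridge would need so that PERR# and SERR#
 * seen on the secondary side are reported on the primary side only when
 * the corresponding option is enabled in Setup.
 */
struct bridge_error_regs apply_setup_options(struct bridge_error_regs regs,
                                             bool perr_enabled,
                                             bool serr_enabled)
{
    regs.command = set_bits(regs.command, PCI_COMMAND_PARITY, perr_enabled);
    regs.command = set_bits(regs.command, PCI_COMMAND_SERR, serr_enabled);
    regs.bridge_control = set_bits(regs.bridge_control,
                                   PCI_BRIDGE_CTL_PARITY, perr_enabled);
    regs.bridge_control = set_bits(regs.bridge_control,
                                   PCI_BRIDGE_CTL_SERR, serr_enabled);
    return regs;
}
```

Firmware would read both registers from each bridge's configuration space, pass them through a function like this, and write the results back. Working on a value copy rather than on hardware directly keeps the sketch independent of any particular configuration-access mechanism.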
6.3.3 Processor Bus Error
If the chipset supports ECC on the processor bus, the BIOS enables the error correction
and detection capabilities of the processors by setting the appropriate bits in the processor
model-specific register (MSR) and inside the chipset.
In the case of irrecoverable errors on the host processor bus, proper execution of the 
asynchronous error handler (usually SMI) cannot be guaranteed and the handler cannot be 
relied upon to log such conditions.  The handler will record the error to the SEL only if the 
system has not experienced a catastrophic failure that compromises the integrity of the handler.