User Guide for Intel SE7520JR2

Intel® Server Board SE7520JR2 
Error Reporting and Handling 
Revision 1.0 
 
 
C78844-002 
6.  Error Reporting and Handling 
This section defines how errors are handled. Also discussed is the role of the BIOS in error 
handling and the interaction between the BIOS, platform hardware, and server management 
firmware with regard to error handling. In addition, error-logging techniques are described and 
beep codes and POST messages are defined.  
Note: The generic term “BMC” may be used throughout this section when a feature and/or 
function being described is common to both the mBMC and the Sahalee BMC. If a described 
feature or function is unique, the specific management controller will be referenced. 
6.1  Fault Resilient Booting (FRB) 
Fault Resilient Booting (FRB) is a set of BIOS and BMC algorithms and hardware support that 
allow a multiprocessor system to boot in case of failure of the bootstrap processor (BSP) under 
certain conditions.  FRB functionality will differ depending on whether standard onboard 
platform instrumentation is used (mBMC) or whether an Intel Management Module is used. 
With on-board platform instrumentation, if a processor failure is detected during POST, 
the mBMC cannot disable the failed or failing processor, so the system may or may not 
continue to boot. An FRB2 error is logged to the System Event Log (SEL) and an error is 
displayed at POST. FRB2 is a BIOS-based algorithm that uses the mBMC IPMI watchdog 
timer to protect against BIOS hangs during POST. 
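As an illustration of the watchdog mechanism FRB2 relies on, the following minimal sketch builds the data bytes of a standard IPMI Set Watchdog Timer request with the timer use set to BIOS FRB2. The field layout follows the IPMI specification; the function name, the 360-second timeout, and the hard-reset timeout action are assumptions made for the example and are not values taken from this board's BIOS.

```python
import struct

# Constants from the IPMI watchdog timer command definitions.
NETFN_APP = 0x06                   # Application network function
CMD_SET_WATCHDOG_TIMER = 0x24      # Set Watchdog Timer command
TIMER_USE_BIOS_FRB2 = 0x01         # timer use field, bits [2:0]
TIMEOUT_ACTION_HARD_RESET = 0x01   # timeout action field, bits [2:0]


def build_frb2_watchdog_request(timeout_seconds: float) -> bytes:
    """Return the 6 request data bytes for a Set Watchdog Timer command.

    Hypothetical helper: the timeout and the reset action are chosen by
    the caller and are assumptions, not documented platform settings.
    """
    counts = round(timeout_seconds * 10)  # countdown unit is 100 ms
    if not 0 <= counts <= 0xFFFF:
        raise ValueError("timeout does not fit the 16-bit countdown field")

    timer_use = TIMER_USE_BIOS_FRB2              # byte 1
    timer_actions = TIMEOUT_ACTION_HARD_RESET    # byte 2, no pre-timeout interrupt
    pre_timeout_interval = 0                     # byte 3, seconds
    clear_flags = 1 << TIMER_USE_BIOS_FRB2       # byte 4, clear the FRB2 expiration flag
    # Bytes 5-6 hold the initial countdown value, least significant byte first.
    return struct.pack("<BBBBH", timer_use, timer_actions,
                       pre_timeout_interval, clear_flags, counts)


if __name__ == "__main__":
    # Assumed 6-minute timeout, for illustration only.
    print(build_frb2_watchdog_request(360).hex(" "))
```

A caller would send these bytes with network function 0x06 and command 0x24 over whatever IPMI transport is available; the transport itself is outside the scope of this sketch.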
On systems that have an Intel Management Module installed, several different levels of FRB are 
supported: FRB1, FRB2, FRB3, and OS Watchdog Timer. The FRB algorithms detect BSP 
failures and take steps to disable that processor and reset the system so another processor will 
run as the BSP.   
6.1.1  FRB1 – BSP Self-Test Failures 
The BIOS provides an FRB1 timer.  Early in POST, the BIOS checks the Built-in Self Test 
(BIST) results of the BSP.  If the BSP fails BIST, the BIOS asks the Sahalee BMC to disable 
the BSP. The Sahalee BMC disables the BSP, selects a new BSP, and generates a system 
reset.  If no alternate processor is available, the Sahalee BMC generates a beep code and 
halts the system.  If the Sahalee BMC is not installed, the BIOS can only notify the user that 
the BIST failed; no processors will be disabled. 
The BIST failure is displayed during POST and an error is logged to the SEL. 
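The FRB1 outcomes described above can be summarized as a small decision function. This is only a sketch of the documented behavior, not BIOS or BMC firmware code; the enum, the function name, and the parameter names are hypothetical.

```python
from enum import Enum


class Frb1Action(Enum):
    CONTINUE_POST = "BSP passed BIST, continue POST"
    DISABLE_BSP_AND_RESET = "disable BSP, select new BSP, reset system"
    BEEP_AND_HALT = "generate beep code and halt system"
    REPORT_ONLY = "notify user of BIST failure, disable nothing"


def frb1_action(bsp_bist_passed: bool,
                sahalee_bmc_installed: bool,
                alternate_cpu_available: bool) -> Frb1Action:
    """Map the FRB1 conditions from the text to the resulting action."""
    if bsp_bist_passed:
        return Frb1Action.CONTINUE_POST
    if not sahalee_bmc_installed:
        # Without the Sahalee BMC, no processor can be disabled.
        return Frb1Action.REPORT_ONLY
    if alternate_cpu_available:
        return Frb1Action.DISABLE_BSP_AND_RESET
    return Frb1Action.BEEP_AND_HALT


if __name__ == "__main__":
    print(frb1_action(False, True, True).value)
```

In every failing case the BIST failure is still displayed during POST and logged to the SEL; the function only models which recovery step follows.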
6.1.2  FRB2 – BSP POST Failures 
A second timer (FRB2) is set to several minutes by the BIOS and is designed to guarantee that the 
system completes POST. The FRB2 timer is enabled just before the FRB3 timer is disabled to 
prevent any “unprotected” window of time. Near the end of POST, the BIOS disables the FRB2 
timer. If the system contains more than 1 GB of memory and the user chooses to test every 
DWORD of memory, the watchdog timer is extended before the extended memory test starts, 
because the memory test can exceed the timer duration. The BIOS will also disable the 
watchdog timer before prompting the user for a boot password.  If the system hangs during 
POST, before the BIOS disables the FRB2 timer, the Sahalee BMC generates an asynchronous