Cisco Content Delivery Engine 250 Installation Guide

1  Background 
 
Current CDE250 systems utilize an Intelligent Platform Management Interface (IPMI) 
infrastructure that monitors and manages the health of the system. This is implemented using 
embedded IPMI firmware running within the Baseboard Management Controller (BMC) system 
chip on the motherboard.  The current version of IPMI firmware used on the CDE250 is v2.05. 
 
There is currently one (1) known issue associated with the IPMI / BMC subsystem and the IPMI 
FW version 2.05 and one (1) known cosmetic issue with the kipmi process. 

1.1  sdt (Superdoctor) becomes non-responsive with kipmi0 reporting high CPU utilization
 
Technical analysis: 
  Investigation of the Linux kipmi implementation showed that kipmi is, by design, 
effectively a very fast polling loop. When an IPMI command has been issued to the 
controller, kipmi0 polls for completion of that command and then retrieves the bytes of 
the response. Unfortunately, the hardware interface (KCS) has no interrupt mechanism 
that could be used to avoid polling. When many IPMI commands are executing, high CPU 
utilization is therefore observed, especially when the BMC is slow to respond. This is 
inherent to the KCS protocol. 
  However, kipmi0 is a low-priority process, so the high CPU utilization it reports does 
not impact the rest of the system. 
  While kipmi0 was showing high CPU utilization, Linux profiling confirmed that it was 
busy with I/O access. This observed behavior is consistent with the fast polling loop 
of kipmi. 
  If an IPMI command is issued to the BMC while the BMC is in a reset state, the BMC 
never acknowledges the command, and the kipmi0 process remains stuck in its polling 
loop until the system is reset. 
  The technical team found that the interaction between the BMC and the IPMI firmware 
could be improved. IPMI FW v3.03 contains a fix that requires the BMC to acknowledge 
pending commands after a BMC reset has occurred. This fix prevents the kipmi0 process 
from remaining stuck in the polling loop while waiting for an acknowledgement. 
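
The polling behavior described in the analysis above can be illustrated with a small simulation. This is only a sketch: `FakeBmc` and `poll_for_completion` are hypothetical names invented for illustration, not part of the Linux IPMI driver or of the CDE250 software. The poll budget `max_polls` is likewise an assumption, added so that the example terminates. The loop described in the analysis has no such limit when the BMC never acknowledges.

```python
class FakeBmc:
    """Hypothetical stand-in for a BMC reached over a polled (KCS-style) interface."""

    def __init__(self, ready_after=None):
        # ready_after: number of status reads before the response is ready.
        # None models a BMC that never acknowledges the command
        # (for example, a command issued while the BMC is in reset).
        self.ready_after = ready_after
        self.reads = 0

    def response_ready(self):
        # Each call models one status read; there is no interrupt to wait on.
        self.reads += 1
        return self.ready_after is not None and self.reads >= self.ready_after


def poll_for_completion(bmc, max_polls):
    """Busy-poll until the response is ready.

    Returns the number of polls used, or None if the budget is exhausted.
    """
    for polls in range(1, max_polls + 1):
        if bmc.response_ready():
            return polls
    return None


fast = FakeBmc(ready_after=5)      # BMC answers quickly: few polls
slow = FakeBmc(ready_after=5000)   # slow BMC: many polls, more CPU time
stuck = FakeBmc(ready_after=None)  # BMC never acknowledges

fast_polls = poll_for_completion(fast, max_polls=100_000)
slow_polls = poll_for_completion(slow, max_polls=100_000)
stuck_polls = poll_for_completion(stuck, max_polls=100_000)
```

In this model a slow BMC costs proportionally more polls, and a BMC that never acknowledges uses up the whole budget. Without a budget, the last case would spin indefinitely, which corresponds to the stuck kipmi0 symptom.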
 
Root Cause: IPMI FW versions older than v3.03 do not require the BMC to respond to or clear 
all pending IPMI commands upon completion of a BMC reset. 
 
Solution: Update the IPMI firmware to v3.03.  Firmware v3.03 contains the fix to correct this 
kipmi polling/BMC reset hang issue.  We have validated this new firmware internally at Cisco.
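
As a rough aid for deciding whether a system still needs this update, the version comparison can be sketched as follows. The sketch assumes the installed firmware revision is available as text containing a line of the form `Firmware Revision : 2.05`. That line format and the helper names `parse_firmware_version` and `needs_update` are assumptions made for illustration, not a documented Cisco procedure.

```python
import re

# v3.03 is the firmware version named in the text as containing the fix.
FIXED_VERSION = (3, 3)


def parse_firmware_version(text):
    """Extract (major, minor) from a 'Firmware Revision : X.YY' line, or None."""
    match = re.search(r"Firmware Revision\s*:\s*(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def needs_update(text):
    """Return True if the reported firmware is older than the fixed version."""
    version = parse_firmware_version(text)
    if version is None:
        raise ValueError("no firmware revision found in input")
    return version < FIXED_VERSION


# Illustrative inputs only; these are not captured from a real system.
old_report = "Firmware Revision         : 2.05"
new_report = "Firmware Revision         : 3.03"
old_needs_update = needs_update(old_report)
new_needs_update = needs_update(new_report)
```

Comparing the versions as integer tuples rather than as strings avoids ordering errors such as "2.10" sorting before "2.9".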