Cisco ASR 5500 Troubleshooting Guide

operation, the system reboots itself.
Session managers also maintain statistics for each Access Point Name (APN), service, functionality, and so
on; these statistics are permanently lost when a crash occurs. An external entity that collects bulkstats
periodically will therefore observe a dip in the statistics when one or more crashes occur. The dip is visible
when the statistics are plotted over a time axis.
Note: A typical chassis populated with 7-14 PSC or 4-10 DPC cards has about 120-160 session managers,
dependent upon the number of PSC/DPC cards, and a single crash will result in the loss of about 1/40th or
1/80th of the statistics. When a standby session manager takes over, it begins to accumulate the statistics
again from zero.
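A collector that polls bulkstats can flag such dips automatically. The following is a minimal sketch, assuming the collected value is a cumulative counter that should never decrease between samples; the function name, timestamps, and counter values are invented for illustration and do not come from a real system.

```python
from typing import List, Tuple

def find_dips(samples: List[Tuple[str, int]]) -> List[Tuple[str, int, int]]:
    """Return (timestamp, previous_value, current_value) for every sample
    that is lower than the sample immediately before it."""
    dips = []
    for (_, prev), (ts, cur) in zip(samples, samples[1:]):
        if cur < prev:
            dips.append((ts, prev, cur))
    return dips

# Hypothetical chassis-wide counter sampled every 15 minutes (values invented).
samples = [
    ("10:00", 120000),
    ("10:15", 121500),
    ("10:30", 120100),  # lower than the previous sample: possible crash
    ("10:45", 121600),
]

for ts, prev, cur in find_dips(samples):
    print(f"{ts}: counter dropped from {prev} to {cur}")
```

A dip found this way only suggests that a crash may have occurred; it should be confirmed against the crash list, the logs, or the SNMP traps.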
How to know if a crash occurred?
A crash triggers an SNMP trap event to a network monitoring station, such as the Event Monitoring
Service (EMS), and generates syslog events. Crashes that have occurred in the system can also be listed with
the show crash list command. Note that this command lists both unexpected and expected crash events, as
described earlier. The two types of crash event can be distinguished by the header that describes
each crash.
A task crash followed by successful session recovery is indicated by this log message:
"Death notification of task <name>/<instance id> on <card#>/<cpu#> sent to parent
 task <parent name>/<instance id> with failover of <task name>/<instance id>
 on <card#>/<cpu#>"
A task crash that could not recover is indicated by this log message:
"Death notification of task <name>/<instance id> on <card#>/<cpu#> sent to parent
 task <parent name>/<instance id>"
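The two message formats differ only in the trailing "with failover of" clause, so a log scanner can separate recovered from unrecovered crashes with one pattern. The following is a minimal sketch, assuming the messages follow the templates above exactly (whitespace and line wrapping aside); the function name and the concrete task names and numbers in the sample messages are made up for illustration.

```python
import re
from typing import Dict, Optional

# Mirrors the message templates shown above; \s+ tolerates wrapped lines.
DEATH_RE = re.compile(
    r"Death notification of task\s+(?P<task>[^/\s]+)/(?P<instance>\d+)\s+"
    r"on\s+(?P<card>\d+)/(?P<cpu>\d+)\s+sent to parent\s+"
    r"task\s+(?P<parent>[^/\s]+)/(?P<parent_instance>\d+)"
    r"(?P<failover>\s+with failover of\b)?"
)

def classify(message: str) -> Optional[Dict[str, object]]:
    """Return crash details, or None if this is not a death notification."""
    m = DEATH_RE.search(message)
    if m is None:
        return None
    return {
        "task": m.group("task"),
        "instance": int(m.group("instance")),
        "card": int(m.group("card")),
        "cpu": int(m.group("cpu")),
        # The failover clause is present only when session recovery succeeded.
        "recovered": m.group("failover") is not None,
    }

# Hypothetical messages with invented values filled into the templates.
recovered_msg = ("Death notification of task sessmgr/204 on 9/1 sent to parent\n"
                 " task sessctrl/0 with failover of sessmgr/3\n on 10/0")
unrecovered_msg = ("Death notification of task sessmgr/204 on 9/1 sent to parent\n"
                   " task sessctrl/0")

print(classify(recovered_msg))
print(classify(unrecovered_msg))
```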
In summary, with session recovery enabled, most crashes go unnoticed because they have no subscriber
impact. To detect crashes, you must enter the show crash list command or examine the logs or SNMP
notifications.
For example:
 ******** show crash list *******
Tuesday May 26 05:54:14 BDT 2015
=== ==================== ======== ========== =============== =======================
#           Time         Process  Card/CPU/        SW          HW_SER_NUM
                                     PID         VERSION       MIO / Crash Card
=== ==================== ======== ========== =============== =======================
1   2015-May-07+11:49:25 sessmgr  04/0/09564 17.2.1          SAD171600WS/SAD172200MH
2   2015-May-13+17:40:16 sessmgr  09/1/05832 17.2.1          SAD171600WS/SAD173300G1
3   2015-May-23+09:06:48 sessmgr  03/1/31883 17.2.1          SAD171600WS/SAD1709009P
4   2015-May-25+15:58:59 sessmgr  09/1/16963 17.2.1          SAD171600WS/SAD173300G1
5   2015-May-26+01:15:15 sessmgr  04/0/09296 17.2.1          SAD171600WS/SAD172200MH
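When many crashes are listed, it helps to group them by card and CPU to see whether one location is crashing repeatedly. The following is a minimal sketch that parses captured show crash list output, assuming the column layout shown above; the regular expression and function name are illustrative and may need adjusting for other software versions.

```python
import re
from collections import Counter

# One data row: number, time, process, card/cpu/pid, version, MIO/crash card serials.
ROW_RE = re.compile(
    r"^\s*(?P<num>\d+)\s+"
    r"(?P<time>\d{4}-[A-Za-z]{3}-\d{2}\+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<process>\S+)\s+"
    r"(?P<card>\d+)/(?P<cpu>\d+)/(?P<pid>\d+)\s+"
    r"(?P<version>\S+)\s+"
    r"(?P<mio_serial>[^/\s]+)/(?P<crash_card_serial>\S+)\s*$"
)

# Data rows copied from the example output above.
OUTPUT = """\
=== ==================== ======== ========== =============== =======================
1   2015-May-07+11:49:25 sessmgr  04/0/09564 17.2.1          SAD171600WS/SAD172200MH
2   2015-May-13+17:40:16 sessmgr  09/1/05832 17.2.1          SAD171600WS/SAD173300G1
3   2015-May-23+09:06:48 sessmgr  03/1/31883 17.2.1          SAD171600WS/SAD1709009P
4   2015-May-25+15:58:59 sessmgr  09/1/16963 17.2.1          SAD171600WS/SAD173300G1
5   2015-May-26+01:15:15 sessmgr  04/0/09296 17.2.1          SAD171600WS/SAD172200MH
"""

def parse_crash_list(text):
    """Return one dict per data row; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW_RE.match(line)
        if m:
            rows.append(m.groupdict())
    return rows

rows = parse_crash_list(OUTPUT)
# Count crashes per (card, cpu) to highlight repeat offenders.
per_cpu = Counter((int(r["card"]), int(r["cpu"])) for r in rows)
for (card, cpu), count in per_cpu.most_common():
    print(f"card {card} cpu {cpu}: {count} crash(es)")
```

In this sample, cards 4 and 9 each account for two of the five crashes, which is the kind of pattern worth checking against the crash card serial numbers.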
 ******** show snmp trap history verbose *******
 Fri May 22 19:43:10 2015 Internal trap notification 1099 (ManagerRestart) facility
 sessmgr instance 204 card 9 cpu 1 
 Fri May 22 19:43:29 2015 Internal trap notification 73 (ManagerFailure) facility
 sessmgr instance 204 card 9 cpu 1 
 Fri May 22 19:43:29 2015 Internal trap notification 150 (TaskFailed) facility
 sessmgr instance 204 on card 9 cpu 1
 Fri May 22 19:43:29 2015 Internal trap notification 151 (TaskRestart) facility