Cisco IP Contact Center Release 4.6.2 Maintenance Manual

Architecture
The architecture of the system software allows the system to continue to function if one
component fails. This ability is called fault tolerance. To ensure that the system software
continues to operate in the case of a computer failure, all critical parts of the system can be
physically duplicated. There can be two or more physical Network Interface Controllers (NICs),
two physical Peripheral Gateways (PGs) at each call center, and two Central Controllers. The
communication paths between critical components can also be duplicated.
The critical components of the system software include the Central Controller (CallRouter and
Logger), PGs, and NICs. Normal non-HDS Administration & Data Servers and Administration
Clients are not considered to be critical to the operation of the system since they play no active
role in routing calls or storing historical data.
When both sides of a component (that is, Side A and Side B) are available to the system, that
component is said to be duplexed. When only one side of a component that is set up as duplexed
is available, the component runs on that side alone. You might have some components in your
Unified ICM system that are duplexed and others that are simplexed. For example, you might
have a duplexed Central Controller (two CallRouters and two Loggers) and simplexed Peripheral
Gateways (in a lab environment only) at call center sites.
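The terms above can be illustrated with a small sketch. It is not part of the system software: the Component class, its fields, and the returned labels are hypothetical names chosen for this example, and it assumes that a simplexed component is one set up on a single side.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical model of one component; not a Unified ICM data structure."""
    name: str
    configured_duplexed: bool  # True if the component is set up with Side A and Side B
    side_a_up: bool
    side_b_up: bool = False    # ignored when the component is set up as simplexed


def operating_mode(component: Component) -> str:
    """Classify a component using the definitions given in the text."""
    if not component.configured_duplexed:
        # Assumption: a simplexed component has only one side to check.
        return "simplexed" if component.side_a_up else "down"
    sides_up = int(component.side_a_up) + int(component.side_b_up)
    if sides_up == 2:
        return "duplexed"
    if sides_up == 1:
        return "running on one side"
    return "down"


central_controller = Component("Central Controller", True, True, True)
lab_pg = Component("Lab PG", False, True)
print(operating_mode(central_controller))
print(operating_mode(lab_pg))
```

A monitoring script built along these lines would report a duplexed Central Controller that has lost Side B as "running on one side" rather than as simplexed, which keeps the configured setup distinct from the current state.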
It takes more than duplicate hardware to achieve fault tolerance. The Unified ICM system can
quickly detect that a component has failed, bypass that component, and use its duplicate instead.
The system software can also initiate diagnostics and service so that the failed component can
be fixed or replaced and the system returned to duplexed operation.
Approaches to Fault Tolerance
The system software uses two approaches to fault tolerance: hot standby and synchronized
execution. In the hot standby approach, one set of processes is called the primary, and the other
is called the backup. In this model, the primary process performs the work at hand while the
backup process is idle. In the event of a primary process failure, the backup process is activated
and takes over. Peripheral Gateways optionally use the hot standby approach to fault tolerance.
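The hot standby model can be sketched as follows. This is a minimal illustration of the general technique, not the Peripheral Gateway implementation: the Process and HotStandbyPair classes, the alive flag, and the side names are all assumptions made for the example.

```python
class Process:
    """Hypothetical worker process identified by name."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True

    def handle(self, request: str) -> str:
        if not self.alive:
            raise RuntimeError(f"{self.name} is not running")
        return f"{self.name} handled {request}"


class HotStandbyPair:
    """The primary does the work; the backup stays idle until the primary fails."""

    def __init__(self, primary: Process, backup: Process) -> None:
        self.active = primary
        self.idle = backup

    def handle(self, request: str) -> str:
        if not self.active.alive:
            if not self.idle.alive:
                raise RuntimeError("both sides are down")
            # Failover: activate the backup and let it take over the work.
            self.active, self.idle = self.idle, self.active
        return self.active.handle(request)


pair = HotStandbyPair(Process("PG-A"), Process("PG-B"))
print(pair.handle("call 1"))   # handled by the primary
pair.active.alive = False      # simulate a primary process failure
print(pair.handle("call 2"))   # handled by the former backup
```

In this sketch the failure is noticed only when the next request arrives. A real system would typically detect the failure through some form of health monitoring, which is left out here.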
The system software uses synchronized execution in the Central Controller. In the synchronized
execution approach, all critical processes (CallRouter and Logger) are duplicated on separate
computers. There is no concept of primary or backup. Both process sets run in a synchronized
fashion, processing duplicate input and producing duplicate output. Each synchronized system
is an equal peer. Each set of peers is a synchronized process pair.
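The idea of processing duplicate input and producing duplicate output can be sketched as below. The sketch assumes that each peer is deterministic, so identical input in identical order yields identical output. The RouterPeer class, its alternating routing rule, and the run_synchronized function are hypothetical and do not describe how the CallRouter decides routes or how the system software keeps the two sides in step.

```python
class RouterPeer:
    """Hypothetical deterministic peer; the routing rule is invented for illustration."""

    def __init__(self, side: str) -> None:
        self.side = side
        self.calls_seen = 0

    def process(self, call_id: str) -> tuple:
        # The output depends only on the input sequence, never on which side runs it.
        self.calls_seen += 1
        target = "agent_group_1" if self.calls_seen % 2 else "agent_group_2"
        return (call_id, target)


def run_synchronized(peers: list, inputs: list) -> list:
    """Feed every input to every peer and check that all outputs are duplicates."""
    outputs = []
    for call_id in inputs:
        results = [peer.process(call_id) for peer in peers]
        if any(result != results[0] for result in results):
            raise RuntimeError("peers diverged")
        outputs.append(results[0])
    return outputs


side_a, side_b = RouterPeer("A"), RouterPeer("B")
print(run_synchronized([side_a, side_b], ["c1", "c2", "c3"]))
```

Because neither peer is a primary or a backup, either one can supply the result. The check for divergence is a property of this sketch, not a documented feature of the system software.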
In the event that one of the synchronized processes fails (for example, a CallRouter goes off-line),
its peer continues to run. There is no loss of data and calls continue to be routed. When the
failed member of the pair returns to operation, it is resynchronized with its peer and begins to
Administration Guide for Cisco Unified ICM/Contact Center Enterprise & Hosted Release 8.x
Chapter 2: Fault Tolerance