Cisco MediaSense Release 9.1(1) Design Guide

High Availability

Cisco MediaSense implements a redundant, highly available architecture. Under normal operation, all deployed servers are always fully
active. The following sections describe various aspects of this design.

Recording Server Redundancy - New Recordings

As mentioned elsewhere, a Cisco MediaSense cluster may contain up to five servers, each capable of recording up to a specific number of
simultaneous calls. The methodology differs slightly for Unified CM and CUBE calls, but conceptually there are two phases involved. First, the
call controller (Unified CM or CUBE) selects a Cisco MediaSense server (the pilot server) to send the initial Invite to, and second, the pilot
server redirects the Invite to a potentially more appropriate server (the home server) to handle the call. Since any server may function as the
pilot server for any call, the first phase is designed to prevent any single point of failure for the initial Invite. The second phase allows Cisco
MediaSense servers to balance the load among themselves without any active support from the call controller. The algorithm used is aware
of the state of all recording servers within the cluster and will not direct recordings to failed servers, or servers with critically low disk space or
other impacted conditions. It also ensures that the two media streams associated with a given call are recorded on the same server.
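
The second phase described above can be illustrated with a minimal Python sketch. The guide does not document the actual selection algorithm, so the least-loaded rule used here is an assumption, as are the RecordingServer type, its field names, and the choose_home_server function. The sketch only shows the documented constraints: failed servers and servers with critically low disk space are never chosen, and a single home server is chosen per call, so both media streams land on it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RecordingServer:
    """Hypothetical view of one server's state as seen by the pilot server."""
    name: str
    in_service: bool    # False if the server has failed
    disk_ok: bool       # False if disk space is critically low
    active_calls: int   # calls currently being recorded
    max_calls: int      # simultaneous-call capacity of this server


def choose_home_server(servers: List[RecordingServer]) -> Optional[RecordingServer]:
    """Pick one home server for a call, or None if no server is eligible.

    Assumption: the eligible server with the most spare capacity wins.
    The real MediaSense algorithm is not described in this guide.
    """
    eligible = [
        s for s in servers
        if s.in_service and s.disk_ok and s.active_calls < s.max_calls
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda s: s.max_calls - s.active_calls)
```

Because the function returns a single server per call, a caller that redirects the Invite to that server keeps both media streams of the call together.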

Unified CM should be configured such that it sends its Invites to each server in succession, in round robin fashion. This ensures that recording
servers are fully equal to each other in terms of initial SIP Invite preference, and avoids situations in which one server receives the bulk of
Invites. As CUBE does not support a round robin distribution, it should instead be configured to always deliver its Invites to one particular
Cisco MediaSense server, with a second and perhaps a third server configured as lower preference alternatives. If possible, it is best to target
an Expansion server rather than a Primary or Secondary server for the pilot role, only because Expansion servers are typically doing less work
at any given time.
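
The difference between the two distribution styles can be sketched as follows. This is only an illustration of the ordering each call controller would present for successive calls; the function names and server names are hypothetical, and the actual behavior is set through Unified CM and CUBE configuration rather than code.

```python
from typing import Iterator, List


def round_robin_targets(servers: List[str]) -> Iterator[List[str]]:
    """Unified CM style: for each new call, rotate the list so that every
    server takes its turn as the first (pilot) choice."""
    start = 0
    while True:
        yield servers[start:] + servers[:start]
        start = (start + 1) % len(servers)


def fixed_preference_targets(servers: List[str]) -> Iterator[List[str]]:
    """CUBE style: every call tries the same preferred server first, with
    the remaining servers as lower-preference alternatives."""
    while True:
        yield list(servers)
```

With the round robin generator, three servers each appear first once in every three calls. With the fixed preference generator, the first entry is always the pilot, which is why the text suggests placing an Expansion server there.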

If any recording server is down or its network is disconnected, it cannot respond to the call controller's SIP Invite. The usual SIP processing for
both Unified CM and CUBE in this case is to deliver the Invite to the next server in the list, thereby also implementing a form of redundancy.
However, the call controller must wait for a timeout to expire before determining that it must try another server. The SIP specification actually
envisions it trying the same server several times, with progressively growing timeouts, before determining that the targeted server is unavailable. Since Unified
CM and CUBE only involve recording servers after the primary media path has already been established, such operations can clearly take
much too long for the resulting recording to be useful. Unified CM in fact sets a time limit beyond which, if the recording hasn't begun, it will
stop trying. The net result is that if Unified CM selects a recording server which is not responding, the call in question will most likely not be
recorded. CUBE does not have such a time limit; such calls will end up being recorded, but a substantial initial segment of the call will be
clipped.
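
To give a sense of how long the "progressively growing timeouts" can take, the following sketch computes the Invite retransmission schedule using the default timer values from RFC 3261 for unreliable transports (T1 = 500 ms, interval doubling after each retransmission, transaction timeout at 64 x T1). These are the specification defaults only; the timers actually in effect on Unified CM or CUBE are configurable and may differ, and the function name is hypothetical.

```python
from typing import List, Tuple


def invite_send_times(t1: float = 0.5, timeout_factor: int = 64) -> Tuple[List[float], float]:
    """Return (send_times, deadline) in seconds for one unanswered Invite.

    Assumes RFC 3261 defaults over UDP: the first retransmission fires
    after T1, each later interval doubles, and the transaction gives up
    at timeout_factor * T1.
    """
    deadline = t1 * timeout_factor
    times = []
    elapsed, interval = 0.0, t1
    while elapsed < deadline:
        times.append(elapsed)
        elapsed += interval
        interval *= 2
    return times, deadline
```

With the defaults, the Invite is sent at 0, 0.5, 1.5, 3.5, 7.5, 15.5 and 31.5 seconds, and the server is declared unavailable at 32 seconds. A delay of that size before trying the next server explains why an unresponsive server usually means a lost or heavily clipped recording.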

To reduce the likelihood of losing recordings due to a recording server failure, Cisco MediaSense works with Unified CM and CUBE to support
a facility known as "SIP Options Ping". This allows the call controller to periodically probe each recording server to make sure it is up and
running, without having to do so while a conversation is literally waiting to be recorded. Once it is aware that a given Cisco MediaSense server
is not running, the call controller will skip that server as it traverses its round robin or sequential list of recording servers, thereby distributing
the incoming load across the remaining servers. In single-node deployments, however, SIP Options Ping is not recommended. Not only is it
not helpful, but it can in fact result in unnecessary failure recovery delays. The Cisco MediaSense User Guide contains instructions for
configuring the SIP Options Ping facility as well as other CUBE and Unified CM SIP parameters.
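
The effect of SIP Options Ping on server selection can be sketched as a status table that is refreshed in the background and consulted when a call arrives. This is a conceptual model only: the probe callable, the function names, and the server names are hypothetical, and in a real deployment the probing is performed by Unified CM or CUBE itself.

```python
from typing import Callable, Dict, List, Optional


def refresh_status(servers: List[str], probe: Callable[[str], bool]) -> Dict[str, bool]:
    """Run one round of probes outside of any call setup.

    `probe` stands in for sending a SIP OPTIONS request and returns True
    if the server answered.
    """
    return {server: probe(server) for server in servers}


def first_reachable(order: List[str], status: Dict[str, bool]) -> Optional[str]:
    """Walk the round robin or sequential list and skip servers whose last
    probe failed, so no Invite waits on a server already known to be down."""
    for server in order:
        if status.get(server, False):
            return server
    return None
```

The point of the design is that the cost of discovering a dead server is paid during the periodic probe, not while a conversation is waiting to be recorded.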

From a sizing perspective, be sure to provision enough recording ports so that if one server fails, you still have enough capacity to capture all
the expected concurrent calls. Similarly, the amount of storage space available for recording session retention will also be impacted.
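
The sizing rule above amounts to provisioning one server more than the peak load strictly requires. A minimal sketch of that arithmetic follows; the per-server port count is a placeholder that must come from the capacity figures for the actual hardware, and the function names are hypothetical. The five-server cluster limit is taken from this guide.

```python
import math

MAX_CLUSTER_SERVERS = 5  # cluster size limit stated in this guide


def servers_needed(peak_concurrent_calls: int, ports_per_server: int,
                   tolerated_failures: int = 1) -> int:
    """Servers required so that peak load still fits after
    `tolerated_failures` servers are lost."""
    base = math.ceil(peak_concurrent_calls / ports_per_server)
    return base + tolerated_failures


def fits_in_one_cluster(peak_concurrent_calls: int, ports_per_server: int) -> bool:
    """True if the failure-tolerant server count stays within one cluster."""
    return servers_needed(peak_concurrent_calls, ports_per_server) <= MAX_CLUSTER_SERVERS
```

For example, with a purely illustrative figure of 300 ports per server, a peak of 900 concurrent calls needs three servers for the load plus one spare, four in total.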

Recording Server Redundancy - Recordings in Progress

If a recording server fails, all calls which are currently being captured on that server are changed from an ACTIVE state to an ERROR state,
and the contents are discarded. Note that the detection of such failed calls, and therefore the state change, may not occur for some time, on
the order of an hour or two.

There is currently no ability to continue or to transfer in-progress recordings to an alternate server.

Recording Server Redundancy - Saved Recordings

After a recording is completed, Cisco MediaSense retains that recording on the same server which captured it. If that server goes out of
service, then none of its recordings will be available for playback, conversion, or download during that period, even though information about
them can still be found in the metadata.

Metadata Database Redundancy

The Primary and Secondary servers (which we will call the "database servers" in this section) each maintain a database for metadata and
configuration data. They also each implement the Cisco MediaSense API, including the ability to publish events to subscribed clients. Once
deployed, the two database servers are fully symmetric: the databases are fully replicated such that writing to either one causes the other to
be updated as well. Clients may also address their HTTP API requests to either server, and use the alternate server as a fallback in case of
failure.
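
A client can take advantage of this symmetry with a simple fallback wrapper. The sketch below is an assumption-laden illustration, not the MediaSense client library: the base URLs and path are placeholders supplied by the caller, no real API endpoint names are implied, and authentication is omitted. The fetch parameter is injectable so the fallback logic can be exercised without a network.

```python
import urllib.request
from typing import Callable, Sequence


def http_get(url: str, timeout: float = 5.0) -> bytes:
    """Plain HTTP GET using the standard library."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def request_with_fallback(base_urls: Sequence[str], path: str,
                          fetch: Callable[[str], bytes] = http_get) -> bytes:
    """Try each database server in turn and return the first response.

    urllib's URLError and socket timeouts are both OSError subclasses, so
    catching OSError covers an unreachable or unresponsive server.
    """
    last_error = None
    for base in base_urls:
        try:
            return fetch(base + path)
        except OSError as exc:
            last_error = exc
    raise ConnectionError(f"all API servers failed: {last_error}")
```

A caller would pass the Primary and Secondary server addresses in whichever order it prefers; because the databases are fully replicated, either server can answer the request.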