VMware 4 User Manual

VMware white paper
VMware® Fault Tolerance (FT) provides continuous availability to virtual machines, eliminating downtime and disruption, even in
the event of a complete host failure. This white paper briefly describes the VMware FT architecture and discusses the
performance implications of this feature, with data from a wide variety of workloads.
1. VMware Fault Tolerance architecture
The technology behind VMware Fault Tolerance is called VMware® vLockstep. The following sections describe some of the key aspects 
of VMware vLockstep technology.
1.1. Deterministic record/replay
Deterministic Record/Replay is a technology introduced with VMware Workstation 6.0 that allows for capturing the execution of a 
running virtual machine for later replay. Deterministic replay of computer execution is challenging since external inputs like incoming 
network packets, mouse, keyboard, and disk I/O completion events operate asynchronously and trigger interrupts that alter the code 
execution path. Deterministic replay can be achieved by recording the non-deterministic inputs and then injecting those inputs at
the same execution point during replay (see Figure 1). This method requires far less processing and storage than
exhaustively recording and replaying individual instructions.
Figure 1. Event Injection during Replay (events shown in the figure: Disk I/O, Timer Event)
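The record and replay steps described above can be illustrated with a minimal sketch. This is a toy model, not VMware's implementation: the names `ToyVM`, `record`, and `replay` are hypothetical, the "execution point" is assumed to be a simple instruction counter, and an external input is modeled as an integer that perturbs the machine state.

```python
from dataclasses import dataclass


@dataclass
class ToyVM:
    """Toy machine: state changes on every instruction and on every input."""
    instructions: int = 0  # instructions executed so far (the execution point)
    state: int = 0

    def step(self):
        # One deterministic "instruction".
        self.state = (self.state * 31 + 7) % 1_000_003
        self.instructions += 1

    def deliver(self, payload):
        # An external input (for example an I/O completion) alters execution.
        self.state ^= payload


def record(total_steps, live_events):
    """Run with live inputs and log only (execution point, payload) pairs."""
    vm, log = ToyVM(), []
    for _ in range(total_steps):
        for payload in live_events.get(vm.instructions, []):
            log.append((vm.instructions, payload))
            vm.deliver(payload)
        vm.step()
    return vm.state, log


def replay(total_steps, log):
    """Re-run from the initial state, injecting each logged input at its point."""
    vm, i = ToyVM(), 0
    for _ in range(total_steps):
        while i < len(log) and log[i][0] == vm.instructions:
            vm.deliver(log[i][1])
            i += 1
        vm.step()
    return vm.state


# Hypothetical inputs: one event at instruction 3, two at instruction 10.
final_state, event_log = record(20, {3: [0xAB], 10: [0x11, 0x22]})
replayed_state = replay(20, event_log)
```

Only three log entries are needed to reproduce twenty instructions here, which is the point of the approach: the log grows with the number of non-deterministic inputs, not with the number of instructions executed.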
In order to efficiently inject the inputs at the correct execution point, some processor changes were required. VMware collaborated 
with AMD and Intel to make sure all currently shipping Intel and AMD server processors support these changes. See
 for a list of supported processors.
VMware currently supports record/replay only for uniprocessor virtual machines. Record/Replay of symmetric multi-processing (SMP) 
virtual machines is more challenging because in addition to recording all external inputs, the order of shared memory access also has 
to be captured for deterministic replay. 
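A small sketch shows why the order of shared memory accesses matters for SMP replay. It is purely illustrative: the two "virtual CPUs", their operations, and the function `run` are assumptions made for this example and do not describe any VMware mechanism.

```python
def add_ten(x):
    return x + 10


def double(x):
    return x * 2


# Hypothetical programs: each vCPU performs one update on a shared variable.
PROGRAMS = {0: [add_ten], 1: [double]}


def run(access_order, x=1):
    """Apply each vCPU's next shared-memory update in the given order."""
    next_op = {cpu: 0 for cpu in PROGRAMS}
    for cpu in access_order:
        x = PROGRAMS[cpu][next_op[cpu]](x)
        next_op[cpu] += 1
    return x


result_a = run([0, 1])  # vCPU 0 first: (1 + 10) * 2 = 22
result_b = run([1, 0])  # vCPU 1 first: 1 * 2 + 10 = 12

# Replaying with the recorded access order reproduces the original result.
recorded_order = [1, 0]
replayed = run(recorded_order)
```

With identical external inputs the two interleavings still produce different results, so a replay that does not also reproduce the recorded access order can diverge from the original run.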
1.2. Fault Tolerance Logging Traffic
Figure 2 shows the high-level architecture of VMware Fault Tolerance.
VMware FT relies on the deterministic record/replay technology described above. When VMware FT is enabled for a virtual machine (the
“primary”), a second instance of the virtual machine (the “secondary”) is created by live-migrating the memory contents of the primary
using VMware® VMotion™. Once live, the secondary virtual machine runs in lockstep and effectively mirrors the guest instruction
execution of the primary.
The hypervisor running on the primary host captures external inputs to the virtual machine and transfers them asynchronously to the 
secondary host. The hypervisor running on the secondary host receives these inputs and injects them into the replaying virtual machine 
at the appropriate execution point. The primary and the secondary virtual machines share the same virtual disk on shared storage, but 
all I/O operations are performed only on the primary host. While the hypervisor does not issue I/O produced by the secondary, it posts 
all I/O completion events to the secondary virtual machine at the same execution point as they occurred on the primary.
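The flow of logging traffic described above can be sketched as follows. This is a simplified model under stated assumptions, not the actual FT protocol: `Guest`, `SharedDisk`, `run_primary`, and `run_secondary` are hypothetical names, an in-memory queue stands in for the logging network link, and the rule that the guest writes to disk every fourth step is invented for the example.

```python
from collections import deque


class SharedDisk:
    """Stand-in for the virtual disk on shared storage."""
    def __init__(self):
        self.blocks = {}
        self.writes_issued = 0

    def write(self, block, data):
        self.blocks[block] = data
        self.writes_issued += 1


class Guest:
    """Toy guest whose state is fully determined by its inputs."""
    def __init__(self):
        self.point = 0  # execution point (instruction count)
        self.state = 0

    def step(self):
        self.state = (self.state * 31 + 7) % 1_000_003
        self.point += 1

    def inject(self, payload):
        self.state ^= payload


def run_primary(steps, inputs, disk, channel):
    guest = Guest()
    for _ in range(steps):
        for payload in inputs.get(guest.point, []):
            guest.inject(payload)
            channel.append(("input", guest.point, payload))
        if guest.point % 4 == 3:  # toy rule: the guest issues a disk write
            disk.write(guest.point, guest.state)  # real I/O, primary only
            guest.inject(1)  # completion event delivered to the guest
            channel.append(("io_done", guest.point, 1))
        guest.step()
    return guest.state


def run_secondary(steps, channel):
    guest = Guest()
    for _ in range(steps):
        # Inject inputs and I/O completions at the logged execution point.
        # The secondary never touches the disk itself.
        while channel and channel[0][1] == guest.point:
            _kind, _point, payload = channel.popleft()
            guest.inject(payload)
        guest.step()
    return guest.state


disk, channel = SharedDisk(), deque()
primary_state = run_primary(12, {2: [5], 7: [9, 3]}, disk, channel)
secondary_state = run_secondary(12, channel)
```

In this sketch the primary runs to completion before the secondary starts, which is only a convenience; the property being illustrated is that the secondary reaches the same guest state while every disk write is issued once, by the primary.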