Intel 253668-032US User Manual

8-2   Vol. 3
MULTIPLE-PROCESSOR MANAGEMENT
• To distribute interrupt handling among a group of processors — When several processors are operating in a system in parallel, it is useful to have a centralized mechanism for receiving interrupts and distributing them to available processors for servicing.
• To increase system performance by exploiting the multi-threaded and multi-process nature of contemporary operating systems and applications.
The caching mechanism and cache consistency of Intel 64 and IA-32 processors are 
discussed in Chapter 11. The APIC architecture is described in Chapter 10. Bus and 
memory locking, serializing instructions, memory ordering, and Intel Hyper-
Threading Technology are discussed in the following sections. 
8.1 LOCKED ATOMIC OPERATIONS
The 32-bit IA-32 processors support locked atomic operations on locations in system 
memory. These operations are typically used to manage shared data structures (such 
as semaphores, segment descriptors, system segments, or page tables) in which two 
or more processors may try simultaneously to modify the same field or flag. The 
processor uses three interdependent mechanisms for carrying out locked atomic 
operations:
• Guaranteed atomic operations
• Bus locking, using the LOCK# signal and the LOCK instruction prefix
• Cache coherency protocols that ensure that atomic operations can be carried out on cached data structures (cache lock); this mechanism is present in the Pentium 4, Intel Xeon, and P6 family processors
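As a minimal sketch of how software reaches these mechanisms, the C11 atomics below perform a locked read-modify-write on a shared counter. On x86, compilers such as GCC and Clang typically lower atomic_fetch_add to a LOCK-prefixed XADD instruction, which engages the bus- or cache-locking machinery described above; the exact instruction chosen is a compiler decision, not something this manual text mandates.

```c
#include <stdatomic.h>

/* Shared counter that multiple processors may update concurrently.
   atomic_fetch_add on this object is an atomic read-modify-write;
   on x86 it typically compiles to LOCK XADD. */
static atomic_int counter;

/* Atomically increments the counter and returns its previous value.
   No other processor or bus agent can observe an intermediate state. */
int increment_counter(void)
{
    return atomic_fetch_add(&counter, 1);
}
```

Because the operation is a single atomic read-modify-write, two processors calling increment_counter simultaneously can never both observe the same previous value, which is the property a plain load-add-store sequence cannot guarantee.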
These mechanisms are interdependent in the following ways. Certain basic memory 
transactions (such as reading or writing a byte in system memory) are always guar-
anteed to be handled atomically. That is, once started, the processor guarantees that 
the operation will be completed before another processor or bus agent is allowed 
access to the memory location. The processor also supports bus locking for 
performing selected memory operations (such as a read-modify-write operation in a 
shared area of memory) that typically need to be handled atomically, but are not 
automatically handled this way. Because frequently used memory locations are often 
cached in a processor’s L1 or L2 caches, atomic operations can often be carried out 
inside a processor’s caches without asserting the bus lock. Here the processor’s 
cache coherency protocols ensure that other processors that are caching the same
memory locations are managed properly while atomic operations are performed on 
cached memory locations.
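The read-modify-write case in the paragraph above can be illustrated with a minimal spinlock sketch in C11 atomics. The atomic_exchange call maps to XCHG on x86, which asserts locking implicitly (no explicit LOCK prefix is required for XCHG with a memory operand); when the lock word already resides in the acquiring processor's cache, the operation can complete as a cache lock without asserting LOCK# on the bus. This is an illustrative sketch, not production lock code.

```c
#include <stdatomic.h>

/* A minimal test-and-set spinlock. The lock word is 0 when free,
   1 when held. */
typedef struct {
    atomic_int locked;
} spinlock_t;

void spin_lock(spinlock_t *l)
{
    /* Atomically write 1 and read the old value; loop until the old
       value was 0, meaning we were the one to acquire the lock.
       On x86 this is an XCHG, an implicitly locked instruction. */
    while (atomic_exchange(&l->locked, 1) != 0)
        ;  /* busy-wait; real code would add a PAUSE hint here */
}

void spin_unlock(spinlock_t *l)
{
    /* A plain atomic store releases the lock; stores are not
       reordered with earlier stores on x86. */
    atomic_store(&l->locked, 0);
}
```

Note that a test-and-set lock like this offers no fairness among contending processors, which is exactly the concern raised in the NOTE below.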
NOTE
Where there are contested lock accesses, software may need to 
implement algorithms that ensure fair access to resources in order to 
prevent lock starvation. The hardware provides no resource that 
guarantees fairness to participating agents. It is the responsibility of