AMD 250 User Manual

Cache and Memory Optimizations
Chapter 5
25112
Rev. 3.06
September 2005
Software Optimization Guide for AMD64 Processors
5.3 Cache-Coherent Nonuniform Memory Access (ccNUMA)
Optimization
For applications with multiple threads, use OS functions to run a thread on a particular node and let 
that thread allocate the memory that it requires so that the memory used is local to that node. In the 
Microsoft Windows environment, SetThreadAffinityMask() restricts a thread to the processors of a 
particular node.
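The following is a minimal sketch of the pin-then-allocate pattern described above, not a definitive implementation. The helper names (node_affinity_mask, pin_current_thread_to_node, alloc_after_pinning, check_masks) are hypothetical, and the mask arithmetic assumes that each node owns a contiguous, equally sized range of logical processor numbers, which real systems do not guarantee. A production program would more likely ask the operating system for the node's processor mask (for example with GetNumaNodeProcessorMask() on Windows). Only the Windows path is shown; on other platforms the pin function simply reports failure.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: build an affinity mask for one node, ASSUMING node n
 * owns logical processors [n * cpus_per_node, (n + 1) * cpus_per_node).
 * Returns 0 if the range does not fit in a 64-bit mask. */
uint64_t node_affinity_mask(unsigned node, unsigned cpus_per_node)
{
    if (node >= 64 || cpus_per_node == 0 || cpus_per_node > 64)
        return 0;
    unsigned first = node * cpus_per_node;      /* cannot overflow: <= 63 * 64 */
    if (first + cpus_per_node > 64)
        return 0;
    uint64_t ones = (cpus_per_node == 64)
                        ? UINT64_MAX
                        : ((UINT64_C(1) << cpus_per_node) - 1);
    return ones << first;
}

/* Restrict the calling thread to the processors of the given node.
 * Returns 0 on success, -1 on failure or on non-Windows builds. */
int pin_current_thread_to_node(unsigned node, unsigned cpus_per_node)
{
    uint64_t mask = node_affinity_mask(node, cpus_per_node);
    if (mask == 0)
        return -1;
#ifdef _WIN32
    /* DWORD_PTR is only 32 bits wide in 32-bit builds: reject masks that
     * would be truncated by the cast. */
    if ((uint64_t)(DWORD_PTR)mask != mask)
        return -1;
    /* SetThreadAffinityMask returns the previous mask, or 0 on failure. */
    return SetThreadAffinityMask(GetCurrentThread(), (DWORD_PTR)mask) != 0 ? 0 : -1;
#else
    return -1;   /* this sketch covers the Windows case only */
#endif
}

/* Allocate and initialize a buffer from the (already pinned) calling thread.
 * ASSUMPTION: the OS places pages on the node of the thread that first
 * touches them, so the memset is what makes the memory node-local. */
void *alloc_after_pinning(size_t bytes)
{
    void *p = malloc(bytes);
    if (p != NULL)
        memset(p, 0, bytes);
    return p;
}

/* Sanity check of the mask arithmetic for 2 processors per node. */
void check_masks(void)
{
    assert(node_affinity_mask(0, 2) == UINT64_C(0x3));   /* processors 0-1 */
    assert(node_affinity_mask(1, 2) == UINT64_C(0xC));   /* processors 2-3 */
}
```

A worker thread would call pin_current_thread_to_node() first and only then call alloc_after_pinning(), so that the buffers it works on are requested and first written while it is already running on the intended node.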
Be sure the operating system is properly configured to support ccNUMA. All versions of Microsoft 
Windows XP for AMD64 and Windows Server for AMD64 support ccNUMA without any changes. 
The 32-bit versions of Windows Server 2003, Enterprise Edition and Windows Server 2003, 
Datacenter Edition require the /PAE boot parameter to support ccNUMA. 
For 64-bit Linux, a separate kernel with ccNUMA support may be provided; if so, select that kernel.
Application 
This optimization applies to: 
32-bit software
64-bit software
Rationale
Most multiprocessor systems available today employ a symmetric multiprocessing (SMP) 
architecture. Processors on an SMP platform generally share a common or centralized memory bus, 
so memory access latency is identical regardless of processor position. Because the processors 
use the same bus and memory, system performance may suffer when bottlenecks 
occur due to increased demand on the single memory bus. Figure 1 shows a simplified block diagram 
of a two-processor SMP system.