Software Optimization Guide for AMD64 Processors
25112 Rev. 3.06 September 2005
Appendix B: Implementation of Write-Combining
B.4 Sending Write-Buffer Data to the System
The maximum write-combined throughput is achieved when all quadwords or doublewords in the buffer are valid, 
so that the AMD Athlon 64 and AMD Opteron processors can use one efficient 64-byte memory write 
instead of multiple 8-byte memory writes.
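
The difference between a full and a partially filled buffer can be sketched with a small model. This is a simplified illustration, not AMD's documented flush algorithm: the function name `wc_flush_writes` and the one-bit-per-quadword mask are assumptions made for the example, and doubleword granularity is ignored.

```c
#include <assert.h>

enum { QWORDS_PER_LINE = 8 };   /* 8 quadwords x 8 bytes = one 64-byte buffer */

/* Hypothetical model: bit i of valid_mask is set when quadword i of the
 * 64-byte write buffer holds valid data. Returns the number of memory
 * writes needed to drain the buffer under this simplified model. */
static unsigned wc_flush_writes(unsigned valid_mask)
{
    assert(valid_mask <= 0xFFu);        /* only 8 quadwords per buffer */

    if (valid_mask == 0xFFu)
        return 1;                       /* all valid: one 64-byte write */

    unsigned writes = 0;
    for (int i = 0; i < QWORDS_PER_LINE; ++i)
        if (valid_mask & (1u << i))
            ++writes;                   /* one 8-byte write per valid quadword */
    return writes;
}
```

Under this model a completely filled buffer drains in one write, whereas a buffer with seven of its eight quadwords valid needs seven separate 8-byte writes, which is why filling whole 64-byte lines matters.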
B.5 Write-Combining Optimization on Revision D and E AMD Athlon™ 64 and AMD Opteron™ Processors
The number of write-combining buffers on revision D and revision E AMD Athlon 64 and AMD 
Opteron processors has changed from earlier CPU revisions. Although the number of buffers 
available for write combining depends on the specific CPU revision, current designs provide as many 
as four write buffers for WC memory-mapped I/O address spaces. These same buffers are used for 
streaming store instructions. The number of write buffers determines how many independent linear 
64-byte streams of WC data the CPU can simultaneously buffer.
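
To illustrate why the buffer count bounds the number of independent streams, the following sketch replays interleaved quadword stores through a small pool of buffers. It is a toy model under stated assumptions: the names `wc_simulate`, `wc_stats`, and `fill_interleaved` are hypothetical, and the oldest-first eviction policy is an assumption chosen for the example, not a documented property of the processors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_BYTES  64u
#define MAX_BUFFERS 8u

typedef struct {
    uint64_t line;    /* 64-byte-aligned address held by this buffer */
    uint8_t  valid;   /* one bit per valid quadword */
    int      in_use;
    unsigned age;     /* allocation order, used for oldest-first eviction */
} wc_buffer;

typedef struct {
    unsigned full_flushes;     /* buffers drained with all 8 quadwords valid */
    unsigned partial_flushes;  /* buffers drained before they were full */
} wc_stats;

/* Replays n 8-byte stores through num_buffers write-combining buffers. */
static wc_stats wc_simulate(const uint64_t *addrs, size_t n, unsigned num_buffers)
{
    assert(num_buffers >= 1 && num_buffers <= MAX_BUFFERS);
    wc_buffer buf[MAX_BUFFERS] = {0};
    wc_stats stats = {0, 0};
    unsigned clock = 0;

    for (size_t k = 0; k < n; ++k) {
        uint64_t line = addrs[k] & ~(uint64_t)(LINE_BYTES - 1);
        unsigned qword = (unsigned)((addrs[k] >> 3) & 7u);
        wc_buffer *b = NULL;

        for (unsigned i = 0; i < num_buffers; ++i)      /* open buffer for this line? */
            if (buf[i].in_use && buf[i].line == line) { b = &buf[i]; break; }

        if (b == NULL) {
            for (unsigned i = 0; i < num_buffers; ++i)  /* otherwise take a free buffer */
                if (!buf[i].in_use) { b = &buf[i]; break; }
            if (b == NULL) {                            /* none free: evict the oldest */
                b = &buf[0];
                for (unsigned i = 1; i < num_buffers; ++i)
                    if (buf[i].age < b->age) b = &buf[i];
                ++stats.partial_flushes;                /* evicted buffers are never full */
            }
            b->line = line; b->valid = 0; b->in_use = 1; b->age = clock++;
        }

        b->valid |= (uint8_t)(1u << qword);
        if (b->valid == 0xFF) { ++stats.full_flushes; b->in_use = 0; }
    }
    for (unsigned i = 0; i < num_buffers; ++i)          /* drain what is left */
        if (buf[i].in_use) ++stats.partial_flushes;
    return stats;
}

/* Fills out[] with `streams` linear streams, interleaved one quadword at a time.
 * Stream s starts at the (arbitrary) base address s * 0x1000. */
static void fill_interleaved(uint64_t *out, unsigned streams, unsigned qwords)
{
    for (unsigned i = 0; i < qwords; ++i)
        for (unsigned s = 0; s < streams; ++s)
            out[i * streams + s] = (uint64_t)s * 0x1000u + (uint64_t)i * 8u;
}
```

In this model, four interleaved streams sharing four buffers drain only full 64-byte lines. Adding a fifth stream makes every store miss and evict a buffer that holds a single quadword, so all data leaves as partial flushes. Real hardware may behave differently in detail, but the sketch shows why software should not interleave more WC streams than there are buffers.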
Having multiple write-combining buffers that can combine independent WC streams has implications 
for data throughput rates (bandwidth), especially when the CPU writes data to WC memory-mapped 
I/O devices residing on the AGP, PCI, PCI-X, and PCI-E buses, including:
• Memory-mapped I/O registers (command FIFO, etc.)
• Memory-mapped I/O apertures (windows through which the CPU uses programmed I/O to send data to 
a hardware device)
• Sequential blocks of 2D/3D graphics engine registers written using programmed I/O
• Video memory residing on the graphics accelerator (frame buffer, render buffers, textures, etc.)
HyperTransport tunnels are HyperTransport-to-bus bridges. There are tunnels for AGP, PCI Express, 
PCI and PCI-X. Examples of tunnels are the AMD-8151™ graphics tunnel, the AMD-8131™ I/O 
bus tunnel, and the AMD-8132™ PCI-X tunnel. Many HyperTransport tunnels use a hardware 
optimization feature called write-chaining. In write-chaining, the tunnel device buffers and combines 
separate HyperTransport packets of data sent by the CPU, creating one large burst on the underlying 
bus when the data is received by the tunnel in sequential address order. Using larger bursts results in 

Table 12. Write-Combining Completion Events (Continued)

Event              Comment
WT Nonsequential   If a subsequent WT write is not in ascending sequential order, the
                   write-combining completes. WC writes have no addressing
                   constraints within the 64-byte line being combined.
TLB AD bit set     Write-combining is closed whenever a TLB reload sets the accessed
                   [A] or dirty [D] bits of a Pde or Pte.
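
The WT Nonsequential rule can be expressed as a small predicate. This is a sketch under two assumptions that the table does not state: "ascending sequential order" is read as "the next write starts exactly where the previous one ended", and a write to a different 64-byte line is treated as ending combining for either memory type. The names `mem_type` and `wc_continues` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { MEM_WC, MEM_WT } mem_type;

/* Returns 1 if the next write can keep combining with the previous one,
 * 0 if it would complete (close) the current write-combining operation. */
static int wc_continues(mem_type type, uint64_t prev_addr,
                        unsigned prev_size, uint64_t next_addr)
{
    assert(prev_size > 0 && prev_size <= 8);

    if ((prev_addr >> 6) != (next_addr >> 6))
        return 0;                 /* assumption: a new 64-byte line ends combining */
    if (type == MEM_WC)
        return 1;                 /* WC: any order within the line is fine */
    return next_addr == prev_addr + prev_size;  /* WT: must ascend sequentially */
}
```

For example, a WT write to 0x1008 that follows an 8-byte write to 0x1000 keeps combining, while the same two writes issued in the opposite order do not. For WC memory both orders combine.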