User Manual for AMD 250

Appendix D
AGP Considerations
353
Software Optimization Guide for AMD64 Processors
25112
Rev. 3.06
September 2005
frequencies increase, so will the ratio of operating frequencies between processor caches and DDR 
memory. The processor-to-write-back cache bandwidth is also higher than processor-to-AGP-aperture 
bandwidth (write-combining memory type), since the DDR writes are avoided (as well as GART 
translation latencies).
It may be possible to prevent pollution of the L1-data and L2 caches from DMA data by using the 
nontemporal PREFETCHNTA instruction on the DMA buffer and limiting prefetching of the DMA 
buffer to less than 32 Kbytes (PREFETCHNTA uses only one way of the L1 data cache).
Use PREFETCHNTA on the linear address to the DMA buffer, and not the AGP aperture address, 
before reading or writing the DMA buffer.
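A minimal sketch of this pattern in C, assuming a compiler that provides the SSE intrinsic `_mm_prefetch` (which emits PREFETCHNTA when given `_MM_HINT_NTA`) and a 64-byte cache line. The names `prefetch_nta_range` and `dma_buffer_checksum` and the 16-Kbyte window are illustrative assumptions, not taken from this guide; `buf` stands for the linear address of the DMA buffer, not its AGP aperture address.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
/* _MM_HINT_NTA selects the nontemporal form, PREFETCHNTA. */
#define PREFETCH_NTA(p) _mm_prefetch((const char *)(p), _MM_HINT_NTA)
#else
#define PREFETCH_NTA(p) ((void)(p)) /* no-op on builds without SSE */
#endif

#define CACHE_LINE      ((size_t)64)          /* assumed line size */
#define PREFETCH_WINDOW ((size_t)(16 * 1024)) /* assumed; kept below 32 Kbytes */

/* Issue one PREFETCHNTA per cache line in [p, p + len). */
static void prefetch_nta_range(const void *p, size_t len)
{
    const char *c = (const char *)p;
    for (size_t off = 0; off < len; off += CACHE_LINE)
        PREFETCH_NTA(c + off);
}

/* Hypothetical consumer: sums the bytes of a DMA buffer, prefetching
 * one window at a time so that no more than PREFETCH_WINDOW bytes are
 * requested before they are read. */
uint32_t dma_buffer_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t base = 0; base < len; base += PREFETCH_WINDOW) {
        size_t left = len - base;
        size_t n = left < PREFETCH_WINDOW ? left : PREFETCH_WINDOW;
        prefetch_nta_range(buf + base, n);
        for (size_t i = 0; i < n; i++)
            sum += buf[base + i];
    }
    return sum;
}
```

The window size is a tuning parameter; the only constraint taken from the text above is that it stays under 32 Kbytes.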
Another key optimization for the DMA model on AMD Athlon 64 and AMD Opteron systems is that 
coherency is maintained between processor caches and an AGP master making accesses outside of 
the AGP aperture.
This is a key AGP enhancement that is required of AGP 3.0 target (host platform) systems.
In effect, this means that the device driver can create a DMA buffer in normal write-back memory and 
then pass the physical DRAM page address to the AGP master; in other words, the AGP virtual 
address and GART translation are not used.
Use PREFETCHNTA on the linear address to the DMA buffer, before reading or writing the DMA 
buffer.
If the AGP card hardware is capable of buffering the physical DRAM page addresses sent to the AGP 
card in a FIFO, then in effect the AGP card’s device driver is getting AGP scatter-gather capabilities, 
with cache coherency provided by the processor.
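The FIFO idea can be modeled in software as follows. This is only a sketch under stated assumptions: `sg_fifo`, `sg_fifo_push`, `sg_fifo_pop`, `sg_queue_pages`, the depth of 8 entries, and the 4-Kbyte page size are all hypothetical and do not correspond to any real driver interface or AGP card. It shows a driver queuing physically discontiguous page addresses for a device to consume in order.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { SG_FIFO_DEPTH = 8, PAGE_SIZE_BYTES = 4096 }; /* assumed values */

/* Software model of a card-side FIFO of physical DRAM page addresses. */
typedef struct {
    uint64_t page_addr[SG_FIFO_DEPTH];
    size_t head;  /* index of the next entry to pop */
    size_t tail;  /* index of the next free slot */
    size_t count; /* entries currently queued */
} sg_fifo;

/* Queue one page address; fails if the FIFO is full or the address
 * is not page aligned. */
bool sg_fifo_push(sg_fifo *f, uint64_t phys)
{
    if (f->count == SG_FIFO_DEPTH || (phys % PAGE_SIZE_BYTES) != 0)
        return false;
    f->page_addr[f->tail] = phys;
    f->tail = (f->tail + 1) % SG_FIFO_DEPTH;
    f->count++;
    return true;
}

/* Remove the oldest page address, as the device would when it starts
 * the next transfer. */
bool sg_fifo_pop(sg_fifo *f, uint64_t *phys)
{
    if (f->count == 0)
        return false;
    *phys = f->page_addr[f->head];
    f->head = (f->head + 1) % SG_FIFO_DEPTH;
    f->count--;
    return true;
}

/* Queue the pages of a buffer in order; returns how many were accepted. */
size_t sg_queue_pages(sg_fifo *f, const uint64_t *pages, size_t n)
{
    size_t queued = 0;
    while (queued < n && sg_fifo_push(f, pages[queued]))
        queued++;
    return queued;
}
```

Because the pages need not be adjacent in physical memory, the queue order alone defines the logical buffer, which is what gives the driver scatter-gather behavior.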
D.6 Optimizations for Texture-Map Copies to AGP Memory
To avoid cache pollution, use the same technique described in “Fast-Write Optimizations for 
Video-Memory Copies” on page 349 to copy texture data into AGP memory, since this data tends 
to be nontemporal.
D.7 Optimizations for Vertex-Geometry Copies to AGP Memory
To avoid cache pollution, use the same technique described in “Fast-Write Optimizations for 
Video-Memory Copies” on page 349 to copy vertex data into AGP memory, since this data tends 
to be nontemporal.
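The section on page 349 is not reproduced here, so the following is only a generic nontemporal copy sketch, not that section's exact method. It assumes SSE2 intrinsics (`_mm_stream_si128`, which compiles to MOVNTDQ, and `_mm_sfence`), a 16-byte-aligned destination, and a length that is a multiple of 64 bytes. The function name `nt_copy` and the 256-byte prefetch distance are illustrative assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define HAVE_SSE2_STREAM 1
#else
#define HAVE_SSE2_STREAM 0
#endif

/* Copy len bytes from src to dst with nontemporal stores so that the
 * copied data does not displace useful lines in the caches.
 * Returns 0 on success, -1 if dst is not 16-byte aligned or len is
 * not a multiple of 64. */
int nt_copy(void *dst, const void *src, size_t len)
{
    if (((uintptr_t)dst % 16) != 0 || (len % 64) != 0)
        return -1;
#if HAVE_SSE2_STREAM
    const char *s = (const char *)src;
    char *d = (char *)dst;
    for (size_t off = 0; off < len; off += 64) {
        /* Prefetch the source a few lines ahead with the NTA hint. */
        if (off + 256 < len)
            _mm_prefetch(s + off + 256, _MM_HINT_NTA);
        /* Write one full 64-byte line as four streaming stores. */
        for (size_t i = 0; i < 64; i += 16) {
            __m128i v = _mm_loadu_si128((const __m128i *)(s + off + i));
            _mm_stream_si128((__m128i *)(d + off + i), v);
        }
    }
    _mm_sfence(); /* order the streaming stores before later stores */
#else
    memcpy(dst, src, len); /* fallback for builds without SSE2 */
#endif
    return 0;
}
```

Writing whole 64-byte lines at a time gives write-combining hardware the chance to merge the four stores into one burst; whether it does so depends on the platform.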