User Manual for AMD 250

112
Cache and Memory Optimizations
Chapter 5
25112
Rev. 3.06
September 2005
Software Optimization Guide for AMD64 Processors
5.7 Streaming-Store/Non-Temporal Instructions
Optimization
Use streaming-store instructions such as MOVNTPS and MOVNTQ when writing arrays or buffers
that do not need to reside in the cache. These instructions let the processor perform a write
without first reading the data from memory or from other processors' caches. This saves the time
needed to read the cache line and avoids evicting cached data that may still be needed, which can be
a significant performance advantage. Most compilers expose these instructions through inline
assembly or intrinsics. Routines 5 and 6 in Section 5.13, “Appropriate Memory Copying Routines,”
illustrate combining streaming-store instructions with the PREFETCHNTA instruction
to optimize memory-copy routines.
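As a rough illustration of the intrinsics route, the sketch below fills a float buffer with streaming stores. It is not taken from the guide: the function name fill_stream and its parameters are hypothetical, and it assumes a compiler that provides the SSE/SSE2 intrinsics headers and a destination that is 16-byte aligned, as MOVNTPS requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE: _mm_set1_ps, _mm_stream_ps */
#include <emmintrin.h>  /* SSE2: _mm_mfence */

/* Hypothetical helper (not from the guide): fill n floats at dst with value,
 * using streaming stores for the bulk of the buffer.
 * Assumes dst is 16-byte aligned. */
static void fill_stream(float *dst, size_t n, float value)
{
    assert(((uintptr_t)dst % 16) == 0);

    const __m128 v = _mm_set1_ps(value);
    size_t i = 0;

    /* Each _mm_stream_ps is expected to compile to one MOVNTPS (16 bytes). */
    for (; i + 4 <= n; i += 4)
        _mm_stream_ps(dst + i, v);

    /* Fewer than four floats remain: finish with ordinary stores. */
    for (; i < n; ++i)
        dst[i] = value;

    /* Fence after the last streaming store in the block. */
    _mm_mfence();
}
```

A routine like this only pays off when the buffer will not be read again soon; for data that is reused right away, ordinary cached stores are usually the better choice.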
Application
This optimization applies to:
32-bit software
64-bit software
Rationale
Streaming-store instructions are also sometimes called write-combining instructions. To
improve system performance, the AMD Athlon 64 and AMD Opteron processors aggressively
combine multiple memory-write cycles of any data size that address locations within a 64-byte,
cache-line-aligned write buffer when a streaming-store instruction is used. This combining is
accomplished with write-combine buffers. The number of write-combine buffers is
processor-implementation dependent. Refer to Appendix B for much more detailed information on
write-combining.
Follow the last streaming-store instruction in a block of code with the MFENCE instruction
to ensure that all of the write-combine buffers are written to memory.
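The write-combining behavior described above suggests writing each 64-byte line completely and in order, then fencing once at the end. The following is a minimal sketch under stated assumptions, not the guide's own routine: copy_lines_stream and LINE_FLOATS are hypothetical names, the destination is assumed to be 64-byte aligned, the source 16-byte aligned, and the length a whole number of 64-byte lines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE: _mm_load_ps, _mm_stream_ps */
#include <emmintrin.h>  /* SSE2: _mm_mfence */

enum { LINE_FLOATS = 16 };  /* 64-byte line / 4-byte float */

/* Hypothetical helper (not from the guide): copy `lines` 64-byte lines.
 * Assumes dst is 64-byte aligned and src is 16-byte aligned. */
static void copy_lines_stream(float *dst, const float *src, size_t lines)
{
    assert(((uintptr_t)dst % 64) == 0);
    assert(((uintptr_t)src % 16) == 0);

    for (size_t l = 0; l < lines; ++l) {
        const float *s = src + l * LINE_FLOATS;
        float *d = dst + l * LINE_FLOATS;

        /* Load the whole line first ... */
        __m128 a = _mm_load_ps(s + 0);
        __m128 b = _mm_load_ps(s + 4);
        __m128 c = _mm_load_ps(s + 8);
        __m128 e = _mm_load_ps(s + 12);

        /* ... then issue four back-to-back streaming stores that together
         * cover one 64-byte aligned block, giving the processor the chance
         * to combine them in a single write-combine buffer. */
        _mm_stream_ps(d + 0, a);
        _mm_stream_ps(d + 4, b);
        _mm_stream_ps(d + 8, c);
        _mm_stream_ps(d + 12, e);
    }

    /* One MFENCE after the last streaming store, as the text recommends. */
    _mm_mfence();
}
```

Whether the four stores are actually merged depends on the processor implementation; the sketch only arranges the stores so that combining is possible.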
Streaming-store instructions are also discussed in “Write-Combining Usage” on page 106. Also see
Appendix B, “Implementation of Write-Combining.” For more information on write-combining, see
“Write-Combining” in the AMD64 Architecture Programmer's Manual Volume 2: System
Programming (order# 24593).