User Manual for AMD 250

Chapter 9: Optimizing with SIMD Instructions
Software Optimization Guide for AMD64 Processors
Publication No. 25112, Rev. 3.06, September 2005
9.1 Ensure All Packed Floating-Point Data are Aligned
Optimization
Align all packed floating-point data on 16-byte boundaries. 
Application
This optimization applies to:
32-bit software
64-bit software
Rationale
Misaligned memory accesses reduce the available memory bandwidth, and SSE and SSE2 instructions
have shorter latencies when operating on aligned memory operands.
Aligning data on 16-byte boundaries allows you to use the aligned load instructions (MOVAPS, 
MOVAPD, and MOVDQA), which move through the floating-point unit with shorter latencies and 
reduce the possibility of stalling addition or multiplication instructions that are dependent on the load 
data.
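
The following C fragment is a minimal sketch of one way to apply this recommendation. It is not taken from the guide. It assumes a C11 compiler and library (for aligned_alloc) and, for the SSE path, the xmmintrin.h intrinsics header. The helper names is_aligned16, alloc_floats16, and add_packed are hypothetical and exist only for illustration. The intrinsics _mm_load_ps and _mm_store_ps require 16-byte-aligned addresses and are typically compiled to MOVAPS.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE 1
#endif

/* Hypothetical helper: nonzero if p lies on a 16-byte boundary. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: allocate room for n floats on a 16-byte boundary.
 * C11 aligned_alloc expects a size that is a multiple of the alignment,
 * so the byte count is rounded up. Release the block with free(). */
static float *alloc_floats16(size_t n)
{
    size_t bytes = (n * sizeof(float) + 15u) & ~(size_t)15u;
    return aligned_alloc(16, bytes);
}

/* Hypothetical helper: dst[i] = a[i] + b[i] for i in [0, n).
 * Assumes n is a multiple of 4 and all three pointers are 16-byte aligned.
 * An aligned load from a misaligned address faults, so the assumptions
 * are checked in debug builds. */
static void add_packed(float *dst, const float *a, const float *b, size_t n)
{
    assert(is_aligned16(dst) && is_aligned16(a) && is_aligned16(b));
    assert(n % 4 == 0);
#ifdef HAVE_SSE
    for (size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_load_ps(a + i);            /* aligned load  */
        __m128 vb = _mm_load_ps(b + i);            /* aligned load  */
        _mm_store_ps(dst + i, _mm_add_ps(va, vb)); /* aligned store */
    }
#else
    /* Scalar fallback for targets without SSE. */
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
#endif
}
```

Buffers obtained from alloc_floats16 can be passed directly to add_packed. For static or automatic arrays, a C11 alignment specifier such as alignas(16) from stdalign.h (or the compiler's equivalent attribute) serves the same purpose.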