Software Optimization Guide for AMD64 Processors
25112 Rev. 3.06 September 2005
Appendix E: SSE and SSE2 Optimizations
E.1 Half-Register Operations
Optimization
Take care when mixing data types of operands within the same register.
Application
This optimization applies to:
32-bit software
64-bit software
Rationale
Mixing data types in a single register is harmless if only scalar operations are used. However, this
practice can cause performance problems if the register is used as a source for a vector operation.
Example 1
Avoid code like this:
addps  xmm1, xmm2       ; Add four packed single-precision (FPS) values in XMM1
                        ; to their corresponding values in XMM2.
cvtss2sd xmm1, xmm2     ; Convert the low-order single-precision value in XMM2
                        ; to 64-bit double-precision (FPD) format and store it
                        ; in the lower 64 bits of XMM1.
In this example, the second instruction leaves the upper half of XMM1 in FPS format and the lower 
half in FPD format.
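The register layout that Example 1 produces can be inspected from C. The sketch below is not part of the guide: it assumes an x86 or x86-64 compiler that provides the SSE2 header `emmintrin.h`, and the helper names `mixed_after_cvtss2sd`, `low_as_double`, and `lane_as_float` are hypothetical. A compiler is free to pick different instructions for these intrinsics, so this illustrates the data layout only, not the performance effect.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics; assumes an x86 or x86-64 target */

/* Hypothetical helper: mirrors ADDPS followed by CVTSS2SD on one register. */
static __m128d mixed_after_cvtss2sd(__m128 x1, __m128 x2)
{
    __m128 sum = _mm_add_ps(x1, x2);  /* addps xmm1, xmm2 */

    /* cvtss2sd xmm1, xmm2: the low 64 bits become one double (FPD);
       the high 64 bits of the sum pass through as two floats (FPS). */
    return _mm_cvtss_sd(_mm_castps_pd(sum), x2);
}

/* Hypothetical helper: read the low half of the register as a double. */
static double low_as_double(__m128d r)
{
    return _mm_cvtsd_f64(r);
}

/* Hypothetical helper: read 32-bit lane i (0..3) of the register as a float. */
static float lane_as_float(__m128d r, int i)
{
    float lanes[4];

    assert(i >= 0 && i < 4);
    _mm_storeu_ps(lanes, _mm_castpd_ps(r));
    return lanes[i];
}
```

Reading lanes 2 and 3 with `lane_as_float` returns the single-precision sums, while `low_as_double` returns the converted value, which is the half-FPS, half-FPD state the example warns about.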
Example 2
Avoid code like this:
addps  xmm1,xmm2         ; Add four packed single-precision (FPS) values in XMM1
                         ; to their corresponding values in XMM2.
movlpd xmm1,mem64        ; Move the double-precision value in mem64 to the lower
                         ; half of XMM1.
In this example, the MOVLPD instruction sets the low half of XMM1 to FPD format but leaves the
high half unchanged (in FPS format).
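The same kind of check can be made for Example 2. This C sketch is an illustration rather than anything taken from the guide: it assumes an x86 or x86-64 compiler with `emmintrin.h`, and the `HalfView` type and `view_after_movlpd` function are hypothetical names. It shows only which bits each half of the register holds after the two operations; the instructions a compiler actually emits may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>  /* SSE2 intrinsics; assumes an x86 or x86-64 target */

/* Hypothetical view of a 128-bit register split into its two halves. */
typedef struct {
    double low;      /* bits 63:0 interpreted as one double (FPD)  */
    float  high[2];  /* bits 127:64 interpreted as two floats (FPS) */
} HalfView;

/* Hypothetical helper: mirrors ADDPS followed by MOVLPD on one register. */
static HalfView view_after_movlpd(__m128 x1, __m128 x2, const double *mem64)
{
    HalfView v;
    float lanes[4];
    __m128 sum;
    __m128d mixed;

    assert(mem64 != NULL);

    sum = _mm_add_ps(x1, x2);                         /* addps  xmm1, xmm2  */
    mixed = _mm_loadl_pd(_mm_castps_pd(sum), mem64);  /* movlpd xmm1, mem64 */

    v.low = _mm_cvtsd_f64(mixed);                 /* low half: the loaded double */
    _mm_storeu_ps(lanes, _mm_castpd_ps(mixed));
    v.high[0] = lanes[2];                         /* high half: untouched sums */
    v.high[1] = lanes[3];
    return v;
}
```

The `high` fields still contain the single-precision sums from the addition, and `low` contains the double loaded from memory, matching the mixed state described for MOVLPD.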