User Manual for AMD 250
356
SSE and SSE2 Optimizations
Appendix E
25112
Rev. 3.06
September 2005
Software Optimization Guide for AMD64 Processors
E.1
Half-Register Operations
Optimization
❖ Take care when mixing data types of operands within the same register.
Application
This optimization applies to:
• 32-bit software
• 64-bit software
Rationale
Mixing data types in a single register is harmless if only scalar operations are used. However, this
practice can cause performance problems if the register is used as a source for a vector operation.
Example 1
Avoid code like this:
addps xmm1, xmm2 ; Add four packed single-precision (FPS) values in XMM1
; to their corresponding values in XMM2.
cvtss2sd xmm1, xmm2 ; Convert the low-order single-precision value in XMM2
; to 64-bit double precision FP format and store in
; lower 64-bits of XMM1.
In this example, the second instruction leaves the upper half of XMM1 in FPS format and the lower
half in FPD format.
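The mixed layout described in Example 1 can be observed from C. The following is a minimal sketch, assuming an x86 target with SSE2 and a compiler that provides <emmintrin.h>. The names mixed_view and example1_layout are hypothetical and are not part of this guide. The compiler decides which instructions it emits, so the sketch shows only the resulting register contents, not the performance effect.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2 intrinsics; assumes an x86 target with SSE2 */
#include <string.h>

/* Hypothetical view of one XMM register after the Example 1 sequence. */
typedef struct {
    double low_fpd;      /* bits 63:0 decoded as one double (FPD)  */
    float  high_fps[2];  /* bits 127:64 decoded as two floats (FPS) */
} mixed_view;

/* Hypothetical helper mirroring: addps xmm1, xmm2 ; cvtss2sd xmm1, xmm2 */
mixed_view example1_layout(__m128 xmm1, __m128 xmm2)
{
    unsigned char raw[16];
    mixed_view v;
    __m128d r;

    assert(sizeof(double) == 2 * sizeof(float));

    xmm1 = _mm_add_ps(xmm1, xmm2);                   /* addps xmm1, xmm2    */
    r = _mm_cvtss_sd(_mm_castps_pd(xmm1), xmm2);     /* cvtss2sd xmm1, xmm2 */

    /* Dump the 128 bits and decode each half with its own data type. */
    _mm_storeu_si128((__m128i *)raw, _mm_castpd_si128(r));
    memcpy(&v.low_fpd, raw, sizeof v.low_fpd);
    memcpy(v.high_fps, raw + 8, sizeof v.high_fps);
    return v;
}
```

Called with the packed values {1, 2, 3, 4} and {10, 20, 30, 40}, the low half decodes as the double 10.0 while the high half still decodes as the single-precision sums 33.0 and 44.0, which is the FPD/FPS mix the example warns about.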
Example 2
Avoid code like this:
addps xmm1,xmm2 ; Add four packed single-precision (FPS) values in XMM1
; to their corresponding values in XMM2.
movlpd xmm1,mem64 ; Move the double-precision value in mem64 to the lower
; half of XMM1.
In this example, the MOVLPD instruction sets the low half of XMM1 to FPD format but leaves the
high half unchanged (in FPS format).
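The register contents produced by Example 2 can be inspected in the same way. This is a sketch under the same assumptions (x86 target with SSE2, <emmintrin.h> available). The names xmm_halves, split_halves, example2_movlpd, and example2_movsd are hypothetical. The second function is shown only for contrast: loading a double from memory with MOVSD zeroes the upper 64 bits of the destination, so no stale FPS data is left in the high half. This section of the guide does not itself recommend that alternative.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2 intrinsics; assumes an x86 target with SSE2 */
#include <string.h>

/* Hypothetical view: low half as one double, high half as two floats. */
typedef struct {
    double low_fpd;
    float  high_fps[2];
} xmm_halves;

/* Hypothetical helper: dump a register and decode each half separately. */
static xmm_halves split_halves(__m128d reg)
{
    unsigned char raw[16];
    xmm_halves h;

    assert(sizeof(double) == 2 * sizeof(float));
    _mm_storeu_si128((__m128i *)raw, _mm_castpd_si128(reg));
    memcpy(&h.low_fpd, raw, sizeof h.low_fpd);
    memcpy(h.high_fps, raw + 8, sizeof h.high_fps);
    return h;
}

/* Mirrors: addps xmm1,xmm2 ; movlpd xmm1,mem64 */
xmm_halves example2_movlpd(__m128 xmm1, __m128 xmm2, const double *mem64)
{
    xmm1 = _mm_add_ps(xmm1, xmm2);                              /* addps  */
    return split_halves(_mm_loadl_pd(_mm_castps_pd(xmm1), mem64)); /* movlpd */
}

/* Contrast: a scalar double load (MOVSD from memory) clears bits 127:64. */
xmm_halves example2_movsd(const double *mem64)
{
    return split_halves(_mm_load_sd(mem64));
}
```

With mem64 holding 2.5 and the packed inputs {1, 2, 3, 4} and {10, 20, 30, 40}, example2_movlpd yields a low half of 2.5 and a high half that still holds the single-precision sums 33.0 and 44.0. example2_movsd yields the same low half with an all-zero high half.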