User Manual for AMD 250

Chapter 9
Optimizing with SIMD Instructions
Software Optimization Guide for AMD64 Processors
25112
Rev. 3.06
September 2005
9.11 Using SIMD Instructions for Fast Square Roots and Fast Reciprocal Square Roots
Optimization
Use the SIMD vectorized square root (SQRTPS) and reciprocal (RCPPS) instructions to calculate 
square roots and reciprocal square roots of single-precision numbers.
Application
This optimization applies to:
32-bit software
64-bit software
Rationale
SIMD instructions exist for performing vectorized square root and reciprocal operations on 
single-precision numbers. These operations are common in multimedia applications and are also 
useful in scientific computing, such as molecular dynamics simulations. 
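As a rough illustration of the idea, the two instructions can be reached from C through the SSE intrinsics in <xmmintrin.h>. The sketch below is not the guide's listing: the function name rcp_sqrt_sketch is hypothetical, and it assumes that the reciprocal square root is formed by applying SQRTPS and then RCPPS to each group of four values, that num_points is a multiple of 4, and that all inputs are positive. RCPPS returns an approximation (roughly 12 bits of precision), so results differ slightly from a full-precision division.

```c
#include <assert.h>
#include <xmmintrin.h>  /* SSE intrinsics: _mm_sqrt_ps, _mm_rcp_ps */

/* Hypothetical helper, not taken from the guide.
   Writes an approximation of 1/sqrt(r[i]) to rcp_sqrt_r[i], four floats
   per iteration. Assumes num_points is a multiple of 4 and r[i] > 0. */
static void rcp_sqrt_sketch(const float *r, float *rcp_sqrt_r, int num_points)
{
    assert(num_points % 4 == 0);

    for (int i = 0; i < num_points; i += 4) {
        __m128 v    = _mm_loadu_ps(r + i);  /* unaligned load of 4 floats  */
        __m128 root = _mm_sqrt_ps(v);       /* SQRTPS: packed square root  */
        __m128 rcp  = _mm_rcp_ps(root);     /* RCPPS: approx. reciprocal   */
        _mm_storeu_ps(rcp_sqrt_r + i, rcp); /* unaligned store of 4 floats */
    }
}
```

Unaligned loads and stores are used here only so the sketch makes no assumption about how the caller allocated the arrays. With 16-byte-aligned buffers, the aligned forms (_mm_load_ps and _mm_store_ps) could be used instead.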
Example
The following function highlights the use of both the vectorized reciprocal and square-root SSE 
instructions:
; reciprocal_sqrt_sse(float *r, float *rcp_sqrt_r, int num_points);
;
; TO ASSEMBLE INTO *.obj DO THE FOLLOWING:
;       ml.exe -coff -c reciprocal_sqrt_sse.asm
;
.586
.K3D
.XMM
_TEXT   SEGMENT
PUBLIC _reciprocal_sqrt_sse
_reciprocal_sqrt_sse PROC NEAR
;==============================================================================
; INSTRUCTIONS BELOW SAVE THE REGISTER STATE WITH WHICH THIS ROUTINE WAS
;  ENTERED.
; REGISTERS EAX, ECX, EDX ARE CONSIDERED VOLATILE AND ASSUMED TO BE CHANGED
;  WHILE THE REGISTERS BELOW MUST BE PRESERVED IF THE USER IS CHANGING THEM
   push ebp
   mov  ebp, esp
;==============================================================================
; Parameters passed into routine: