User Manual for AMD 250

Integer Optimizations
Chapter 8
25112
Rev. 3.06
September 2005
Software Optimization Guide for AMD64 Processors
8.9 Optimizing Integer Division
Optimization
When possible, use smaller data types for integer division.
Application
This optimization applies to:
32-bit software
64-bit software
Rationale
Division by a 16-bit value is significantly faster than division by a 32-bit value: about 26 clocks 
of latency versus 42. Likewise, division by a 32-bit value is faster than division by a 64-bit value: 
about 42 clocks versus 74. Refer to IDIV in Table 15. In algorithms where integer division accounts 
for a substantial share of execution time, it may be beneficial to check whether a smaller divide 
type can be used. Study the assembly language output generated by high-level language compilers to 
verify that the desired code is generated. Compilers often convert 16-bit types into 32-bit values 
and then perform a 32-bit division, which eliminates the advantage of using 16-bit integer types. 
If the compiler cannot be coerced into producing the desired code, compiler intrinsics or assembly 
language are required.
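
As an illustration of checking whether a smaller divide type is possible, the following is a minimal C sketch, not code from this guide. The helper name div_u64_narrow is hypothetical. It assumes unsigned operands and that small values are common enough for the extra test to pay off. When both operands fit in 32 bits, it divides them as 32-bit values so that the compiler is free to emit the shorter 32-bit divide. Note that the same trick does not carry over directly to 16-bit types in C, because the language's integer promotions widen 16-bit operands to int before the division is performed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the guide): divide two unsigned 64-bit
   values, using a 32-bit division whenever both operands fit in 32 bits. */
uint64_t div_u64_narrow(uint64_t n, uint64_t d)
{
    assert(d != 0);  /* division by zero is undefined behavior in C */

    if (((n | d) >> 32) == 0) {
        /* Upper halves are both zero, so the quotient is unchanged when
           the operands are divided as 32-bit values. */
        return (uint32_t)n / (uint32_t)d;
    }

    /* General case: full 64-bit division. */
    return n / d;
}
```

Whether this is faster depends on the compiler, the processor, and how predictable the branch is, so the generated assembly should still be inspected to confirm that the narrow path really uses a 32-bit divide.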