Справочник Пользователя для AMD 250

Скачать
Страница из 384
Chapter 8
Integer Optimizations
181
Software Optimization Guide for AMD64 Processors
25112
Rev. 3.06
September 2005
8.7
Efficient Binary-to-ASCII Decimal Conversion
Fast binary-to-ASCII decimal conversion can be important to the performance of software working 
with text oriented protocols like HTML, such as web servers. The following examples show two 
optimized functions for fast conversion of unsigned integers-to-ASCII decimal strings on 
AMD Athlon 64 and AMD Opteron processors. The code is written for the Microsoft Visual C 
compiler.
The function 
uint_to_ascii_lz
 converts like 
sprintf(sptr, "%010u", x)
. That is, leading zeros 
are retained, whereas 
uint_to_ascii_nlz
 converts like 
sprintf(sptr, "%u", x)
; that is, leading 
zeros are suppressed.
This code can easily be extended to convert signed integers by isolating the sign information and 
computing the absolute value as shown in Listing  on page 130 before starting the conversion process. 
For restricted argument ranges, construct more efficient conversion routines using the same algorithm 
as used for the general case presented here.
The algorithm first splits the input argument into suitably sized blocks by dividing the input by an 
appropriate power of ten and working separately on the quotient and remainder of that division. The 
DIV instruction is avoided as described in “Replacing Division with Multiplication” on page 160. 
Each block is then converted into a fixed-point format that consists of one (decimal) integer digit and 
a binary fraction. This allows the generation of additional decimal digits by repeated multiplication of 
the fraction by 10. For efficiency reasons the algorithm implements this multiplication by multiplying 
by five and moving the binary point to the right by one bit for each step of the algorithm. To avoid 
loop overhead and branch mispredictions, the digit generation loop is completely unrolled. In order to 
maximize parallelism, the code in 
uint_to_ascii_lz
 splits the input into two equally sized blocks 
each of which yields five decimal digits for the result.
Binary-to-ASCII Decimal Conversion Retaining Leading Zeros
__declspec(naked) void __stdcall uint_to_ascii_lz(char *sptr, unsigned int x)
{
   __asm {
      push edi                ; Save as per calling conventions.
      push esi                ; Save as per calling conventions.
      push ebx                ; Save as per calling conventions.
      mov  eax, [esp+20]      ; x
      mov  edi, [esp+16]      ; sptr
      mov  esi, eax           ; x
      mov  edx, 0xA7C5AC47    ; Divide x by
      mul  edx                ;  10,000 using
      add  eax, 0xA7C5AC47    ;  multiplication
      adc  edx, 0             ;  with reciprocal.
      shr  edx, 16            ; y1 = x / 1e5
      mov  ecx, edx           ; y1
      imul edx, 100000        ; (x / 1e5) * 1e5
      sub  esi, edx           ; y2 = x % 1e5
      mov  eax, 0xD1B71759    ; 2^15 / 1e4 * 2^30
      mul  ecx                ; Divide y1 by 1e4,