AMD Typewriter x86 User Manual

AMD Athlon™ Processor x86 Code Optimization 
22007E/0—November 1999
Example 1 (Avoid):
FLD   QWORD PTR [foo]
FIMUL DWORD PTR [bar]
FIADD DWORD PTR [baz]
Example 2 (Preferred):
FILD  DWORD PTR [bar]
FILD  DWORD PTR [baz]
FLD   QWORD PTR [foo]
FMULP ST(2), ST
FADDP ST(1), ST
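The stack bookkeeping in Example 2 is easy to misread, so the following minimal sketch traces it with a toy model of the x87 register stack. It is an illustration only, not an emulator: the class `X87Stack` and the function `example2` are hypothetical names introduced here, and integer-to-float conversion by FILD is modeled simply as `float()`.

```python
class X87Stack:
    """Toy x87 register stack: st[0] is ST(0), the top of stack."""

    def __init__(self):
        self.st = []

    def fld(self, value):
        # FLD / FILD: push a value; FILD's integer conversion is modeled by float().
        self.st.insert(0, float(value))

    def fmulp(self, i):
        # FMULP ST(i), ST: ST(i) = ST(i) * ST(0), then pop the stack.
        self.st[i] *= self.st[0]
        self.st.pop(0)

    def faddp(self, i):
        # FADDP ST(i), ST: ST(i) = ST(i) + ST(0), then pop the stack.
        self.st[i] += self.st[0]
        self.st.pop(0)


def example2(foo, bar, baz):
    """Trace Example 2 and return the final stack contents."""
    fpu = X87Stack()
    fpu.fld(bar)   # FILD DWORD PTR [bar] -> ST(0)=bar
    fpu.fld(baz)   # FILD DWORD PTR [baz] -> ST(0)=baz, ST(1)=bar
    fpu.fld(foo)   # FLD  QWORD PTR [foo] -> ST(0)=foo, ST(1)=baz, ST(2)=bar
    fpu.fmulp(2)   # FMULP ST(2), ST      -> ST(0)=baz, ST(1)=foo*bar
    fpu.faddp(1)   # FADDP ST(1), ST      -> ST(0)=foo*bar+baz
    return fpu.st
```

Under this model the stack ends with a single entry holding foo*bar+baz, the same value Example 1 leaves in ST(0).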
Align Branch Targets in Program Hot Spots
In program hot spots (i.e., innermost loops in the absence of
profiling data), place branch targets at or near the beginning of
16-byte aligned code windows. This technique helps to
maximize the number of instructions that are filled into the
instruction-byte queue while preserving I-cache space in
branch-intensive code.
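As a rough illustration of the arithmetic involved, the sketch below computes how many padding bytes would move a branch target onto the next 16-byte boundary. The function names and the sample addresses are assumptions made for this example. In practice an assembler alignment directive (for instance `ALIGN 16` in MASM) inserts the padding for you.

```python
WINDOW = 16  # size in bytes of the aligned code window named in the text


def padding_to_window(address, window=WINDOW):
    """Padding bytes needed so that `address` lands on a window boundary."""
    return (-address) % window


def aligned_target(address, window=WINDOW):
    """Address of the branch target after padding has been inserted."""
    return address + padding_to_window(address, window)
```

For a hypothetical loop head at 0x401009, `padding_to_window` returns 7 and the target moves to 0x401010. Padding also consumes I-cache space, which is one reason the text limits this technique to hot spots.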
Use Short Instruction Lengths
Assemblers and compilers should generate the tightest code
possible to optimize use of the I-cache and increase average
decode rate. Wherever possible, use instructions with shorter
lengths. Using shorter instructions increases the number of
instructions that can fit into the instruction-byte queue. For
example, use 8-bit displacements as opposed to 32-bit
displacements. In addition, use the single-byte format of simple
integer instructions whenever possible, as opposed to the
2-byte opcode ModR/M format.
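The size rule above can be sketched as a small selection routine of the kind an assembler might apply. This is a simplified model, assuming 32-bit operand size and register destinations only; the function names are hypothetical. The opcode forms noted in the comments (83 /0 ib, 05 id, 81 /0 id, 74 cb, 0F 84 cd) are the standard x86 encodings for ADD and JZ.

```python
def fits_imm8(value):
    """True if value fits a sign-extended 8-bit immediate or displacement."""
    return -128 <= value <= 127


def add_reg_imm_length(reg, imm):
    """Length in bytes of the shortest `add reg32, imm` encoding."""
    if fits_imm8(imm):
        return 3   # 83 /0 ib: opcode + ModR/M + sign-extended imm8
    if reg == "eax":
        return 5   # 05 id: single-byte accumulator form + imm32
    return 6       # 81 /0 id: opcode + ModR/M + imm32


def jz_length(rel):
    """Length of `jz`, with rel measured from the end of the instruction."""
    if fits_imm8(rel):
        return 2   # 74 cb: short form with 8-bit displacement
    return 6       # 0F 84 cd: 2-byte opcode with 32-bit displacement
```

For instance, `add_reg_imm_length("ebx", -5)` returns 3 and `jz_length(5)` returns 2, compared with 6 bytes each for the long forms.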
Example 1 (Avoid):
81 C0 78 56 34 12  add eax, 12345678h  ;uses 2-byte opcode
                                       ; form (with ModR/M)
81 C3 FB FF FF FF  add ebx, -5         ;uses 32-bit
                                       ; immediate
0F 84 05 00 00 00  jz  $label1         ;uses 2-byte opcode,
                                       ; 32-bit immediate