User Manual for AMD 250

Software Optimization Guide for AMD64 Processors
25112 Rev. 3.06 September 2005
Chapter 6 Branch Optimizations
6.2 Two-Byte Near-Return RET Instruction
Optimization
Use of a two-byte near-return can improve performance. The single-byte near-return (opcode C3h) of
the RET instruction should be used carefully. Specifically, avoid the following two situations:
• Any kind of branch (either conditional or unconditional) that has the single-byte near-return RET
  instruction as its target. See “Examples.”
• A conditional branch that occurs in the code directly before the single-byte near-return RET
  instruction. See “Examples.”
Application
This optimization applies to:
• 32-bit software
• 64-bit software
Rationale
In the two situations listed above, the processor is unable to apply branch prediction to the
single-byte near-return form (opcode C3h) of the RET instruction.
The easiest way to ensure that the branch prediction mechanism is used is to use a two-byte RET
instruction. A two-byte RET has a REP prefix (opcode F3h) inserted before the RET, which produces the
functional equivalent of the single-byte near-return RET instruction, but is not affected by the
prediction limitations outlined above. To use a two-byte RET, define a text macro named REPRET and
use it instead of the RET instruction to force the intended object code:
REPRET TEXTEQU <DB 0F3h, 0C3h>
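As a rough illustration of the macro in use, the sketch below ends a small leaf procedure with REPRET. The procedure name clamp_to_zero and its register usage are hypothetical, chosen only for illustration. The sketch assumes MASM syntax as accepted by the 64-bit assembler; for 32-bit MASM, processor and memory-model directives such as .686 and .MODEL FLAT would be needed before .code. The procedure declares no parameters or locals, so there is no assembler-generated epilogue for the DB directive to bypass.

```asm
REPRET TEXTEQU <DB 0F3h, 0C3h>   ; Emits F3h C3h (REP prefix + near RET).

.code
; Hypothetical leaf procedure: returns ECX in EAX, or 0 if ECX is negative.
clamp_to_zero PROC
   xor   eax, eax    ; Default result is 0.
   test  ecx, ecx    ; Set the sign flag from the input.
   js    done        ; Negative input: keep the default result.
   mov   eax, ecx    ; Non-negative input: return it unchanged.
done:
   REPRET            ; Two-byte return; this return is a branch target.
clamp_to_zero ENDP
END
```

Because REPRET expands to raw bytes, it should only replace a plain near RET. A RET with an immediate operand, or one for which the assembler generates an epilogue, is a different case that this sketch does not cover.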
Examples
Avoid branches in which the target of the branch is a single-byte near-return:
   jmp label   ; Jump to a single-byte near-return RET instruction.
   ...
label:
   ret         ; RET is potentially mispredicted.
Avoid branches that immediately precede a single-byte near-return:
   jz  label   ; Conditional branch is not taken.
   ret         ; RET is a fall-through instruction,
               ;  potentially mispredicted.
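For comparison, the two patterns above could be rewritten along the following lines. This is a minimal sketch that reuses the REPRET text macro from the Rationale section and the same placeholder name, label. The elided code ("...") is left as in the original examples.

```asm
REPRET TEXTEQU <DB 0F3h, 0C3h>   ; Defined once per source file.

; Branch whose target is the return:
   jmp label   ; Jump to the two-byte return.
   ...
label:
   REPRET      ; Emits F3h C3h instead of the single byte C3h.

; Conditional branch directly before the return:
   jz  label   ; Conditional branch is not taken.
   REPRET      ; Fall-through return, now two bytes.
```

Only the return instruction changes in each case. The branches themselves stay as they were.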