AMD Typewriter x86 User Manual
Store-to-Load Forwarding Restrictions
AMD Athlon™ Processor x86 Code Optimization
22007E/0—November 1999
Narrow-to-Wide Store-Buffer Data Forwarding Restriction
If the following conditions are present, there is a
narrow-to-wide store-buffer data forwarding restriction:
■ The operand size of the store data is smaller than the
operand size of the load data.
■ The range of addresses spanned by the store data covers
some sub-region of the range of addresses spanned by the load
data.
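The two conditions above can be sketched as a small predicate. This is only a simplified model of the rule as stated here, not a description of the hardware: the function name and parameters are hypothetical, sizes are in bytes, and "covers some sub-region" is read as "overlaps at least one byte of the load", which is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not from the manual): returns true when a pending
   store and a later load satisfy both narrow-to-wide conditions.
   Addresses are byte addresses; sizes are operand sizes in bytes. */
bool narrow_to_wide_restricted(uint32_t store_addr, uint32_t store_size,
                               uint32_t load_addr, uint32_t load_size)
{
    /* Use 64-bit end addresses so addr + size cannot wrap around. */
    uint64_t store_end = (uint64_t)store_addr + store_size;
    uint64_t load_end = (uint64_t)load_addr + load_size;

    /* Condition 1: the store operand is smaller than the load operand. */
    bool store_is_narrower = store_size < load_size;

    /* Condition 2 (assumed reading): the store's byte range overlaps
       some part of the load's byte range. */
    bool ranges_overlap = store_addr < load_end && load_addr < store_end;

    return store_is_narrower && ranges_overlap;
}
```

Under this model, a word store to 10h followed by a doubleword load from 10h is flagged, as is a byte store to 13h followed by the same load; a doubleword store followed by a doubleword load from the same address is not.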
Avoid the type of code shown in the following two examples.
Example 1 (Avoid):
MOV EAX, 10h
MOV WORD PTR [EAX], BX ;word store
...
MOV ECX, DWORD PTR [EAX] ;doubleword load
;cannot forward upper
; byte from store buffer
Example 2 (Avoid):
MOV EAX, 10h
MOV BYTE PTR [EAX + 3], BL ;byte store
...
MOV ECX, DWORD PTR [EAX] ;doubleword load
;cannot forward upper byte
; from store buffer
Wide-to-Narrow Store-Buffer Data Forwarding Restriction
If the following conditions are present, there is a
wide-to-narrow store-buffer data forwarding restriction:
■ The operand size of the store data is greater than the
operand size of the load data.
■ The start address of the store data does not match the start
address of the load.
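As a rough illustration, the wide-to-narrow conditions can be written as a predicate as well. The function name and parameters are hypothetical, sizes are in bytes, and the overlap check is an added assumption: the text lists only the size and start-address conditions, but forwarding is only in question when the load actually reads bytes written by the store.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not from the manual): returns true when a pending
   store and a later load satisfy both wide-to-narrow conditions.
   Addresses are byte addresses; sizes are operand sizes in bytes. */
bool wide_to_narrow_restricted(uint32_t store_addr, uint32_t store_size,
                               uint32_t load_addr, uint32_t load_size)
{
    /* Use 64-bit end addresses so addr + size cannot wrap around. */
    uint64_t store_end = (uint64_t)store_addr + store_size;
    uint64_t load_end = (uint64_t)load_addr + load_size;

    /* Condition 1: the store operand is larger than the load operand. */
    bool store_is_wider = store_size > load_size;

    /* Condition 2: the store and the load start at different addresses. */
    bool start_differs = store_addr != load_addr;

    /* Assumption: the load must touch bytes written by the store. */
    bool ranges_overlap = store_addr < load_end && load_addr < store_end;

    return store_is_wider && start_differs && ranges_overlap;
}
```

Under this model, a doubleword store to 10h followed by a word load from 12h is flagged. A quadword store followed by a doubleword load from the same start address is not flagged, while a doubleword load from the upper half of that quadword is.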
Example 3 (Avoid):
MOV EAX, 10h
ADD DWORD PTR [EAX], EBX ;doubleword store
MOV CX, WORD PTR [EAX + 2] ;word load-cannot forward high
; word from store buffer
Use example 5 instead of example 4.
Example 4 (Avoid):
MOVQ [foo], MM1 ;store upper and lower half
...
ADD EAX, [foo] ;fine
ADD EDX, [foo+4] ;uh-oh!