User Manual for AMD 250

Software Optimization Guide for AMD64 Processors
25112 Rev. 3.06, September 2005
Appendix C: Instruction Latencies
C.8 SSE2 Instructions
Table 19. SSE2 Instructions
                              Encoding
Syntax                        Prefix  First  2nd   ModRM       Decode      FPU      Latency  Throughput  Note
                              byte    byte   byte  byte        type        pipe(s)
ADDPD xmmreg1, xmmreg2        66h     0Fh    58h   11-xxx-xxx  Double      FADD     5        1/2
ADDPD xmmreg, mem128          66h     0Fh    58h   mm-xxx-xxx  Double      FADD     7        1/2
ADDSD xmmreg1, xmmreg2        F2h     0Fh    58h   11-xxx-xxx  DirectPath  FADD     4        1/1
ADDSD xmmreg, mem64           F2h     0Fh    58h   mm-xxx-xxx  DirectPath  FADD     6        1/1
ANDNPD xmmreg1, xmmreg2       66h     0Fh    55h   11-xxx-xxx  Double      FMUL     3        1/2
ANDNPD xmmreg, mem128         66h     0Fh    55h   mm-xxx-xxx  Double      FMUL     5        1/2
ANDPD xmmreg1, xmmreg2        66h     0Fh    54h   11-xxx-xxx  Double      FMUL     3        1/2
ANDPD xmmreg, mem128          66h     0Fh    54h   mm-xxx-xxx  Double      FMUL     5        1/2
CMPPD xmmreg1, xmmreg2, imm8  66h     0Fh    C2h   11-xxx-xxx  Double      FADD     3        1/2
CMPPD xmmreg, mem128, imm8    66h     0Fh    C2h   mm-xxx-xxx  Double      FADD     5        1/2
CMPSD xmmreg1, xmmreg2, imm8  F2h     0Fh    C2h   11-xxx-xxx  DirectPath  FADD     2        1/1
CMPSD xmmreg, mem64, imm8     F2h     0Fh    C2h   mm-xxx-xxx  DirectPath  FADD     4        1/1
COMISD xmmreg1, xmmreg2       66h     0Fh    2Fh   11-xxx-xxx  VectorPath  FADD     4        1
COMISD xmmreg, mem64          66h     0Fh    2Fh   mm-xxx-xxx  VectorPath  FADD     5        1
CVTDQ2PD xmmreg1, xmmreg2     F3h     0Fh    E6h   11-xxx-xxx  Double      FSTORE   5        1/2
CVTDQ2PD xmmreg, mem64        F3h     0Fh    E6h   mm-xxx-xxx  Double      FSTORE   7        1/2
Notes:
1. The low half of the result is available one cycle earlier than listed.
2. This is the execution latency for the instruction. The time to complete the external write depends on the memory speed and the hardware implementation.
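
As an illustration of how the encoding columns of Table 19 combine, the following Python sketch assembles the register-to-register forms (ModRM byte 11-xxx-xxx) into machine-code bytes. It is not part of the AMD guide. The names OPCODES and encode_reg_reg are hypothetical, and the sketch assumes the standard x86 ModRM layout (mod in bits 7-6, destination register in bits 5-3, source register in bits 2-0) and only registers xmm0 through xmm7, so no REX prefix is needed. CMPPD and CMPSD are left out because they also take a trailing imm8 byte.

```python
# Hypothetical helper, not from the AMD guide: builds the byte sequence for the
# register-register forms listed in Table 19.

# (prefix byte, first opcode byte, second opcode byte), copied from the table.
OPCODES = {
    "ADDPD": (0x66, 0x0F, 0x58),
    "ADDSD": (0xF2, 0x0F, 0x58),
    "ANDNPD": (0x66, 0x0F, 0x55),
    "ANDPD": (0x66, 0x0F, 0x54),
    "COMISD": (0x66, 0x0F, 0x2F),
    "CVTDQ2PD": (0xF3, 0x0F, 0xE6),
}


def encode_reg_reg(mnemonic, dest, src):
    """Encode `mnemonic xmm<dest>, xmm<src>` for xmm0..xmm7 (no REX prefix)."""
    if not (0 <= dest <= 7 and 0 <= src <= 7):
        raise ValueError("this sketch only covers xmm0..xmm7")
    prefix, first, second = OPCODES[mnemonic]
    # ModRM = 11-xxx-xxx: mod = 11b selects the register form,
    # the middle field holds the destination, the low field the source.
    modrm = (0b11 << 6) | (dest << 3) | src
    return bytes([prefix, first, second, modrm])


# ADDPD xmm1, xmm2
print(encode_reg_reg("ADDPD", 1, 2).hex())  # prints 660f58ca
```

The memory forms (ModRM byte mm-xxx-xxx) are not handled here, since they may require additional SIB and displacement bytes depending on the addressing mode.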