Intel 253666-024US User Manual

Vol. 2A 3-565
INSTRUCTION SET REFERENCE, A-M
MASKMOVDQU—Store Selected Bytes of Double Quadword
Description
Stores selected bytes from the source operand (first operand) into a 128-bit 
memory location. The mask operand (second operand) selects which bytes from the 
source operand are written to memory. The source and mask operands are XMM 
registers. The location of the first byte of the memory location is specified by DI/EDI 
and DS registers. The memory location does not need to be aligned on a natural 
boundary. (The size of the store address depends on the address-size attribute.) 
The most significant bit in each byte of the mask operand determines whether the 
corresponding byte in the source operand is written to the corresponding byte 
location in memory: 0 indicates no write and 1 indicates write. 
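The byte-selection rule above can be sketched as a scalar reference model. This is an illustration only, not the manual's operation pseudocode; the function name maskmov_model and its parameters are hypothetical, and the model ignores the non-temporal hint entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the selection rule: byte i of src is stored
   to dst[i] only when the most significant bit (bit 7) of mask[i] is 1.
   Bytes whose mask bit is 0 leave memory untouched. */
static void maskmov_model(uint8_t *dst, const uint8_t src[16],
                          const uint8_t mask[16])
{
    assert(dst != NULL);
    for (size_t i = 0; i < 16; i++) {
        if (mask[i] & 0x80u) {   /* only bit 7 of each mask byte matters */
            dst[i] = src[i];
        }
    }
}
```

Note that a mask byte such as 7FH selects nothing, because only the most significant bit of each mask byte is examined.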
The MASKMOVDQU instruction generates a non-temporal hint to the processor to 
minimize cache pollution. The non-temporal hint is implemented by using a write 
combining (WC) memory type protocol (see “Caching of Temporal vs. Non-Temporal 
Data” in Chapter 10 of the Intel® 64 and IA-32 Architectures Software Developer’s 
Manual, Volume 1). Because the WC protocol uses a weakly-ordered memory 
consistency model, a fencing operation implemented with the SFENCE or MFENCE 
instruction should be used in conjunction with MASKMOVDQU instructions if multiple 
processors might use different memory types to read/write the destination memory 
locations.
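From C, the instruction is normally reached through the SSE2 intrinsic _mm_maskmoveu_si128, and the fence through _mm_sfence. The following is a minimal sketch of pairing the two. The wrapper name masked_store_fenced is hypothetical, and the scalar fallback exists only so that the sketch also builds on targets without SSE2.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>   /* _mm_maskmoveu_si128, _mm_loadu_si128, _mm_sfence */
#define MASKMOV_HAVE_SSE2 1
#else
#define MASKMOV_HAVE_SSE2 0
#endif

/* Hypothetical wrapper: masked 16-byte store followed by a store fence. */
static void masked_store_fenced(uint8_t *dst, const uint8_t src[16],
                                const uint8_t mask[16])
{
    assert(dst != NULL);
#if MASKMOV_HAVE_SSE2
    __m128i data = _mm_loadu_si128((const __m128i *)src);
    __m128i sel  = _mm_loadu_si128((const __m128i *)mask);
    /* Compilers typically emit MASKMOVDQU here; dst need not be aligned. */
    _mm_maskmoveu_si128(data, sel, (char *)dst);
    /* SFENCE orders the weakly-ordered store before any later stores. */
    _mm_sfence();
#else
    /* Portable fallback with the same selection rule, no non-temporal hint. */
    for (int i = 0; i < 16; i++) {
        if (mask[i] & 0x80u) {
            dst[i] = src[i];
        }
    }
#endif
}
```

When many masked stores are issued back to back, a single fence after the last one is usually sufficient; fencing every store, as this sketch does, is the simplest form and not the cheapest.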
Behavior with a mask of all 0s is as follows:
• No data will be written to memory. 
• Signaling of breakpoints (code or data) is not guaranteed; different processor 
  implementations may signal or not signal these breakpoints.
• Exceptions associated with addressing memory and page faults may still be 
  signaled (implementation dependent).
• If the destination memory region is mapped as UC or WP, enforcement of 
  associated semantics for these memory types is not guaranteed (that is, is 
  reserved) and is implementation-specific. 
The MASKMOVDQU instruction can be used to improve performance of algorithms 
that need to merge data on a byte-by-byte basis. MASKMOVDQU should not cause a 
read for ownership; doing so generates unnecessary bandwidth since data is to be 
written directly using the byte-mask without allocating old data prior to the store. 
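As an illustration of a byte-by-byte merge, the sketch below overlays 16 source bytes onto a destination while skipping every source byte equal to a "transparent" key value. The scenario, the function name overlay16, and the key parameter are assumptions made for this example and do not come from the manual. The mask is built with a byte compare, so the destination is never read.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>   /* SSE2 intrinsics */
#define MERGE_HAVE_SSE2 1
#else
#define MERGE_HAVE_SSE2 0
#endif

/* Hypothetical helper: write src[i] to dst[i] unless src[i] == key. */
static void overlay16(uint8_t *dst, const uint8_t src[16], uint8_t key)
{
    assert(dst != NULL);
#if MERGE_HAVE_SSE2
    __m128i data   = _mm_loadu_si128((const __m128i *)src);
    /* 0xFF in every byte position where the source equals the key. */
    __m128i is_key = _mm_cmpeq_epi8(data, _mm_set1_epi8((char)key));
    /* Invert: (~is_key) & all-ones sets bit 7 for bytes to be written. */
    __m128i sel    = _mm_andnot_si128(is_key, _mm_set1_epi8(-1));
    _mm_maskmoveu_si128(data, sel, (char *)dst);
    _mm_sfence();
#else
    /* Portable fallback with the same result. */
    for (int i = 0; i < 16; i++) {
        if (src[i] != key) {
            dst[i] = src[i];
        }
    }
#endif
}
```

A load, blend, and store sequence would give the same result in memory but reads the old destination data first, which is the traffic the masked store is meant to avoid. Whether the masked store is actually faster depends on the processor implementation.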
Opcode       Instruction            64-Bit  Compat/   Description
                                    Mode    Leg Mode
66 0F F7 /r  MASKMOVDQU xmm1, xmm2  Valid   Valid     Selectively write bytes from xmm1 to
                                                      memory location using the byte mask in
                                                      xmm2. The default memory location is
                                                      specified by DS:EDI.