Intel 253668-032US User Manual

MEMORY CACHE CONTROL
Processors based on Intel Core microarchitectures implement one level of instruction 
TLB and two levels of data TLB. The Intel Core i7 processor provides a second-level 
unified TLB. 
The store buffer is associated with the processor's instruction execution units. It 
allows writes to system memory and/or the internal caches to be saved and in some 
cases combined to optimize the processor’s bus accesses. The store buffer is always 
enabled in all execution modes.
The processor’s caches are for the most part transparent to software. When enabled, 
instructions and data flow through these caches without the need for explicit 
software control. However, knowledge of the behavior of these caches may be useful in 
optimizing software performance. For example, knowledge of cache dimensions and 
replacement algorithms gives an indication of how large a data structure can be 
operated on at once without causing cache thrashing.
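The thrashing effect described above can be illustrated with a toy simulation. This is a minimal sketch, not a model of any specific processor: the 64-byte line size, the fully associative organization, the strict LRU replacement policy, and the names LRUCache and hit_rate are all assumptions made for illustration.

```python
from collections import OrderedDict

LINE_SIZE = 64  # bytes per cache line (assumed for illustration)


class LRUCache:
    """Toy fully associative cache with strict LRU replacement (hypothetical)."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()  # line number -> True, least recent first
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        line = addr // LINE_SIZE
        if line in self.lines:
            # Hit: mark the line as most recently used.
            self.lines.move_to_end(line)
            self.hits += 1
        else:
            # Miss: evict the least recently used line if the cache is full.
            self.misses += 1
            if len(self.lines) >= self.num_lines:
                self.lines.popitem(last=False)
            self.lines[line] = True


def hit_rate(array_bytes, cache_lines, passes=4):
    """Scan an array sequentially several times, touching each line once per pass."""
    cache = LRUCache(cache_lines)
    for _ in range(passes):
        for addr in range(0, array_bytes, LINE_SIZE):
            cache.access(addr)
    return cache.hits / (cache.hits + cache.misses)


fits = hit_rate(32 * 1024, cache_lines=512)          # array equals cache capacity
thrashes = hit_rate(32 * 1024 + 64, cache_lines=512)  # array is one line too large
```

Under these assumptions, an array that exactly fits the cache misses only on the first pass, while an array one line larger evicts each line just before it is needed again and never hits. Real caches are set associative and typically use approximations of LRU, so the cliff is usually less abrupt.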
In multiprocessor systems, maintenance of cache consistency may, in rare 
circumstances, require intervention by system software. For these rare cases, the 
processor provides privileged cache control instructions for use in flushing caches 
and forcing memory ordering.
The Pentium III, Pentium 4, and Intel Xeon processors introduced several instructions 
that software can use to improve the performance of the L1, L2, and L3 caches, 
including the PREFETCHh and CLFLUSH instructions and the non-temporal move 
instructions (MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPS, and MOVNTPD). The use of 
these instructions is discussed in Section 11.5.5, “Cache Management Instructions.”
11.2 CACHING TERMINOLOGY
IA-32 processors (beginning with the Pentium processor) and Intel 64 processors use 
the MESI (modified, exclusive, shared, invalid) cache protocol to maintain 
consistency with internal caches and caches in other processors (see Section 11.4, 
“Cache Control Protocol”).
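As a rough orientation for the four MESI states named above, the following sketch encodes the commonly taught textbook transitions. It is a simplification under stated assumptions and not the processor's exact state table, which Section 11.4 describes; the function names local_read, local_write, snoop_read, and snoop_write are hypothetical.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"   # line is dirty and held only by this cache
    EXCLUSIVE = "E"  # line is clean and held only by this cache
    SHARED = "S"     # line is clean and may be held by other caches
    INVALID = "I"    # line holds no valid data


def local_read(state, others_have_copy):
    """This processor reads the line (textbook simplification)."""
    if state is State.INVALID:
        # Read miss: the fill arrives Shared if another cache holds the line.
        return State.SHARED if others_have_copy else State.EXCLUSIVE
    return state  # a read hit leaves the state unchanged


def local_write(state):
    """This processor writes the line; other copies are assumed invalidated."""
    return State.MODIFIED


def snoop_read(state):
    """Another processor reads the line."""
    if state in (State.MODIFIED, State.EXCLUSIVE):
        # A Modified line is assumed to be written back before it is shared.
        return State.SHARED
    return state


def snoop_write(state):
    """Another processor writes the line, so this copy becomes stale."""
    return State.INVALID
```

For example, under this simplified model a line filled by a read miss with no other sharers starts Exclusive, moves to Modified on a local write, and drops to Shared when another processor reads it.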
When the processor recognizes that an operand being read from memory is cacheable, 
the processor reads an entire cache line into the appropriate cache (L1, L2, L3, 
or all). This operation is called a cache line fill. If the memory location containing 
that operand is still cached the next time the processor attempts to access the 
operand, the processor can read the operand from the cache instead of going back to 
memory. This operation is called a cache hit.
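The read path described above (cache line fill on a miss, cache hit on a later access to the same line) can be sketched as follows. This is an illustrative model only: the 64-byte line size, the unbounded capacity, and the ReadCache class with its read_byte method are assumptions, not part of the processor's interface.

```python
LINE_SIZE = 64  # bytes per cache line (assumed for illustration)


class ReadCache:
    """Toy read-only cache that never evicts (hypothetical model)."""

    def __init__(self, memory):
        self.memory = memory  # bytes-like backing store standing in for system memory
        self.lines = {}       # line base address -> copy of the whole line
        self.fills = 0
        self.hits = 0

    def read_byte(self, addr):
        base = addr - (addr % LINE_SIZE)  # address of the line containing addr
        if base in self.lines:
            # Cache hit: the operand is served from the cached line.
            self.hits += 1
        else:
            # Cache line fill: copy the entire line, not just the operand.
            self.lines[base] = bytes(self.memory[base:base + LINE_SIZE])
            self.fills += 1
        return self.lines[base][addr - base]


memory = bytes(range(256))
cache = ReadCache(memory)
first = cache.read_byte(70)    # miss: fills the line covering addresses 64..127
second = cache.read_byte(100)  # hit: same line as address 70
third = cache.read_byte(130)   # miss: fills the line covering addresses 128..191
```

After these three reads the model has performed two line fills and recorded one hit, because the second access falls in the line brought in by the first.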
When the processor attempts to write an operand to a cacheable area of memory, it 
first checks if a cache line for that memory location exists in the cache. If a valid 
cache line does exist, the processor (depending on the write policy currently in force) 
can write the operand into the cache instead of writing it out to system memory. This 
operation is called a write hit. If a write misses the cache (that is, a valid cache line 
is not present for the area of memory being written to), the processor performs a cache 
line fill (write allocation). Then it writes the operand into the cache line and