Compaq EV67 User Manual

Alpha 21264/EV67 Hardware Reference Manual
Cache and External Interfaces
Lock Mechanism
4.6.1 In-Order Processing of LDx_L/STx_C Instructions
The 21264/EV67 uses the stWait logic in the IQ to ensure that LDx_L/STx_C pairs are 
issued in order. The stWait logic treats LDx_L instructions like STx instructions. 
STx_C instructions are always loaded into the IQ with their associated stWait bit set. 
Thus, a STx_C instruction is not issued until the older LDx_L is out of the IQ.
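
The ordering rule above can be illustrated with a small behavioral model. This is only a sketch under simplifying assumptions, not the actual IQ logic: the names IQEntry, load_into_iq, and can_issue are hypothetical, and the model captures only the rule that a STx_C with its stWait bit set waits for any older LDx_L still in the IQ.

```python
from dataclasses import dataclass


@dataclass
class IQEntry:
    seq: int               # program order; a lower value means an older instruction
    op: str                # mnemonic, e.g. "LDx_L", "STx_C", "ADDQ"
    st_wait: bool = False  # hypothetical field modeling the stWait bit


def load_into_iq(seq, op):
    # Per the text, STx_C is always loaded into the IQ with its stWait bit set.
    return IQEntry(seq, op, st_wait=(op == "STx_C"))


def can_issue(entry, iq):
    # Simplified rule: an entry with stWait set may not issue while an
    # older LDx_L is still present in the IQ.
    if not entry.st_wait:
        return True
    return not any(e.op == "LDx_L" and e.seq < entry.seq for e in iq)


iq = [load_into_iq(0, "LDx_L"), load_into_iq(1, "ADDQ"), load_into_iq(2, "STx_C")]
stx_c = iq[2]
blocked_before = not can_issue(stx_c, iq)  # LDx_L is still in the IQ
iq.pop(0)                                  # the LDx_L issues and leaves the IQ
released_after = can_issue(stx_c, iq)      # the STx_C may now issue
```

In this model the STx_C is held only by the presence of the older LDx_L; the unrelated ADDQ never affects it.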
4.6.2 Internal Eviction of LDx_L Blocks
The 21264/EV67 prevents the eviction of cache blocks in the Dcache due to either of 
the following references:
• Istream references with a Bcache index that matches the Dcache block and a 
  Bcache tag that mismatches the Dcache block.
  To avoid evictions of LDx_L blocks, Istream references that match the index of a 
  block in the Dcache are converted to noncached references.
• LDx or STx references with a Dcache index that matches the block.
In the Alpha architecture, Dstream references between a LDx_L/STx_C pair force 
the value of the STx_C success flag to be UNPREDICTABLE. The 21264/EV67 
forces all STx_C instructions that interrupt an LDx_L/STx_C pair to fail in 
program order. 
There should be no Dstream references between LDx_L/STx_C pairs; however, the 
out-of-order nature of the 21264/EV67 can introduce Dstream references between 
LDx_L/STx_C pairs. To prevent load or store instructions older than the LDx_L 
from evicting the LDx_L cache block, the Mbox invokes a replay trap on the 
incoming load or store instruction, which also aborts the LDx_L. These instructions 
are issued in program order in the next iteration of the trap retry down the pipeline.  
To prevent newer load or store instructions from evicting the locked cache line, the 
Ibox ensures that a STx_C is issued before any newer load or store instruction by 
placing the STx_C into the IQ and stalling all subsequent instructions in the map 
stage of the pipe until the IQ is empty.
Branch instructions between the LDx_L/STx_C pair may be mispredicted, 
introducing load and store instructions that evict the locked cache block. To prevent that 
from happening, there is a bit in the instruction fetcher that is set for a LDx_L 
reference and cleared on any other memory reference. When this bit is set, the branch 
predictor predicts all branches to fall through.
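
The fetcher bit described above can be sketched as a tiny state machine. This is an illustrative model only: the class FetchLockBit, the MEMORY_OPS set, and the mnemonic strings are assumptions made for the example, not names from the hardware.

```python
# Hypothetical set of mnemonics treated as "other memory references".
MEMORY_OPS = {"LDx", "STx", "STx_C"}


class FetchLockBit:
    def __init__(self):
        self.lock_bit = False

    def observe(self, op):
        # The bit is set by a LDx_L reference and cleared by any other
        # memory reference; non-memory instructions leave it unchanged.
        if op == "LDx_L":
            self.lock_bit = True
        elif op in MEMORY_OPS:
            self.lock_bit = False

    def predict_taken(self, normal_prediction):
        # While the bit is set, every branch is predicted to fall through,
        # regardless of what the predictor would otherwise say.
        return False if self.lock_bit else normal_prediction


fetcher = FetchLockBit()
fetcher.observe("LDx_L")
inside_pair = fetcher.predict_taken(True)   # forced to fall through
fetcher.observe("ADDQ")                     # not a memory reference
still_inside = fetcher.predict_taken(True)  # bit is still set
fetcher.observe("STx_C")                    # memory reference clears the bit
after_pair = fetcher.predict_taken(True)    # normal prediction is used again
```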
4.6.3 Liveness and Fairness
To prevent a livelock condition, the 21264/EV67 processes the STx_C as follows:
1. If a STx_C misses the Dcache, then no system port transaction is started and the 
STx_C fails.
2. If a STx_C hits a block that is not dirty, then a ChangeToDirty (Shared or Clean) is 
launched after the STx_C retires and all older store queue entries are in the writable 
state. This ensures that once the ChangeToDirty command is launched on behalf of 
the STx_C, the STx_C will be executed to completion if the ChangeToDirty 
command succeeds.
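
The two cases above can be summarized as a decision function. This is a minimal sketch, not the processor's implementation: the names StxCAction, stx_c_action, and may_launch_change_to_dirty are hypothetical, and the handling of a hit on an already dirty block is an assumption, since the steps above do not describe that case.

```python
from enum import Enum


class StxCAction(Enum):
    FAIL = "fail; no system port transaction is started"
    CHANGE_TO_DIRTY = "launch ChangeToDirty once the launch conditions hold"
    WRITE_LOCALLY = "assumed: block is already dirty, no ChangeToDirty needed"


def stx_c_action(dcache_hit, block_dirty):
    if not dcache_hit:
        # Case 1: a Dcache miss fails the STx_C immediately.
        return StxCAction.FAIL
    if not block_dirty:
        # Case 2: a hit on a block that is not dirty needs a ChangeToDirty.
        return StxCAction.CHANGE_TO_DIRTY
    # Assumption: this case is not covered by the steps above.
    return StxCAction.WRITE_LOCALLY


def may_launch_change_to_dirty(stx_c_retired, older_stores_writable):
    # The command is launched only after the STx_C retires and all older
    # store queue entries are in the writable state.
    return stx_c_retired and all(older_stores_writable)
```

For example, may_launch_change_to_dirty(True, [True, False]) is false in this model because one older store queue entry is not yet writable.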