Compaq EV67 User Manual

Alpha 21264/EV67 Hardware Reference Manual
Internal Architecture
Memory and I/O Address Space Instructions
2.8.3 Memory Address Space Store Instructions
The Mbox begins execution of a store instruction by translating its virtual address to a 
physical address using the DTB and by probing the Dcache. The Mbox puts information
about the store instruction, including its physical address, its data, and the results of
the Dcache probe, into the store queue (SQ).
If the Mbox does not find the addressed location in the Dcache, it places the address 
into the MAF for processing by the Cbox. If the Mbox finds the addressed location in a 
Dcache block that is not dirty, then it places a ChangeToDirty request into the MAF.
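The two cases above can be summarized as a small decision function. The following Python sketch is illustrative only and is not the hardware's logic. The names ProbeResult and maf_request_for_store, and the "Fill" label for the miss case, are hypothetical; the sketch also does not model the shared state of a block.

```python
from enum import Enum
from typing import Optional


class ProbeResult(Enum):
    """Outcome of the Dcache probe for a store (hypothetical encoding)."""
    MISS = "miss"            # addressed location not found in the Dcache
    HIT_CLEAN = "hit_clean"  # block found, but it is not dirty
    HIT_DIRTY = "hit_dirty"  # block found and already dirty


def maf_request_for_store(probe: ProbeResult) -> Optional[str]:
    """Return the MAF request a store would generate, or None if none is needed."""
    if probe is ProbeResult.MISS:
        # The address is placed into the MAF for processing by the Cbox.
        # "Fill" is a label invented for this sketch.
        return "Fill"
    if probe is ProbeResult.HIT_CLEAN:
        # The block is present but not dirty, so ownership must be upgraded.
        return "ChangeToDirty"
    return None  # block already dirty: no MAF entry in this simplified model
```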
A store instruction can write its data into the Dcache when it is retired, and when the 
Dcache block containing its address is dirty and not shared. SQ entries that meet these 
two conditions can be placed into the writable state. These SQ entries are placed into 
the writable state in program order at a maximum rate of two entries per cycle. The 
Mbox transfers writable store queue entry data from the SQ to the Dcache in program 
order at a maximum rate of two entries per cycle. Dcache lines associated with writable 
store queue entries are locked by the Mbox. System port probe commands cannot evict 
these blocks until their associated writable SQ entries have been transferred into the 
Dcache. This restriction assists in STx_C instruction and Dcache ECC processing.
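The writable-state rule can be modeled in a few lines. The Python sketch below is a simplified illustration, not a description of the actual Mbox logic. The SQEntry fields and function names are hypothetical, and the sketch assumes that "in program order" means promotion stops at the first entry that does not yet meet both conditions.

```python
from dataclasses import dataclass
from typing import List

MAX_PER_CYCLE = 2  # from the text: at most two entries per cycle


@dataclass
class SQEntry:
    retired: bool        # the store instruction has retired
    block_dirty: bool    # the Dcache block containing its address is dirty
    block_shared: bool   # the Dcache block is shared
    writable: bool = False


def can_become_writable(entry: SQEntry) -> bool:
    """Both conditions from the text: retired, and block dirty and not shared."""
    return entry.retired and entry.block_dirty and not entry.block_shared


def promote_one_cycle(sq: List[SQEntry]) -> int:
    """Mark up to MAX_PER_CYCLE entries writable, oldest first; return the count."""
    promoted = 0
    for entry in sq:  # sq is ordered oldest to newest (program order)
        if entry.writable:
            continue  # already promoted in an earlier cycle
        if promoted == MAX_PER_CYCLE or not can_become_writable(entry):
            break  # assumption: a blocked older entry blocks all newer entries
        entry.writable = True
        promoted += 1
    return promoted
```

Calling promote_one_cycle once per simulated cycle on a queue of three eligible entries promotes two entries in the first cycle and the remaining entry in the second.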
SQ entry data that has not been transferred to the Dcache may source data to newer load 
instructions. The Mbox compares the virtual Dcache index bits of incoming load 
instructions to queued SQ entries, and sources the data from the SQ, bypassing the 
Dcache, when necessary.
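The index comparison can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the bit range used for the index (INDEX_SHIFT and INDEX_BITS) is hypothetical and is not taken from this manual, the function names are invented for the example, and an index match is treated as sufficient to source data.

```python
from typing import List, Optional, Tuple

# Hypothetical index field; the actual bit range is not specified in this section.
INDEX_SHIFT = 3
INDEX_BITS = 12


def dcache_index(vaddr: int) -> int:
    """Extract the (assumed) virtual Dcache index bits from a virtual address."""
    return (vaddr >> INDEX_SHIFT) & ((1 << INDEX_BITS) - 1)


def forward_from_sq(load_vaddr: int, sq: List[Tuple[int, int]]) -> Optional[int]:
    """Return data from the newest queued store whose index matches the load.

    sq holds (virtual address, data) pairs for older stores, oldest first.
    Returns None when no queued store matches, meaning the load reads the Dcache.
    """
    target = dcache_index(load_vaddr)
    for store_vaddr, data in reversed(sq):  # search newest to oldest
        if dcache_index(store_vaddr) == target:
            return data
    return None
```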
2.8.4 I/O Address Space Store Instructions
The Mbox begins processing I/O space store instructions, like memory space store 
instructions, by translating the virtual address and placing the state associated with the 
store instruction into the SQ.
The Mbox replays retired I/O space store entries from the SQ to the IOWB in program 
order at a rate of one per GCLK cycle. The Mbox never allows queued I/O space store 
instructions to source data to subsequent load instructions.  
The Cbox maximizes I/O bandwidth when it allocates a new IOWB entry to an I/O 
store instruction by attempting to merge I/O store instructions in a merge register. Table
2–8 shows the rules for I/O space store instruction data merging. The columns represent
the store instructions replayed to the IOWB, while the rows represent the size of the store
in the merge register.
Table 2–8 shows some of the following rules:
Table 2–8 Rules for I/O Address Space Store Instruction Data Merging
Merge Register/
Replayed Instruction   Store Byte/Word   Store Longword         Store Quadword
Byte/Word              No merge          No merge               No merge
Longword               No merge          Merge up to 32 bytes   No merge
Quadword               No merge          No merge               Merge up to 64 bytes
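The rules in Table 2–8 reduce to a simple lookup: stores merge only when the merge register and the replayed store are the same size, and only for longword and quadword sizes. The Python sketch below encodes the table as data. The size strings and function names are hypothetical and chosen for illustration.

```python
# Maximum merged size in bytes per store size; sizes absent from the map never merge.
MERGE_LIMIT_BYTES = {"longword": 32, "quadword": 64}


def can_merge(merge_register_size: str, replayed_size: str) -> bool:
    """Return True if Table 2-8 allows the replayed store to merge."""
    return (
        merge_register_size == replayed_size
        and replayed_size in MERGE_LIMIT_BYTES
    )


def merge_limit(merge_register_size: str, replayed_size: str) -> int:
    """Return the maximum merged size in bytes, or 0 when no merge is allowed."""
    if can_merge(merge_register_size, replayed_size):
        return MERGE_LIMIT_BYTES[replayed_size]
    return 0
```

For example, merge_limit("longword", "longword") returns 32, while merge_limit("longword", "quadword") returns 0.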