User Manual for AMD 250

Software Optimization Guide for AMD64 Processors (Publication 25112, Rev. 3.06, September 2005)
Appendix A: Microarchitecture for AMD Athlon™ 64 and AMD Opteron™ Processors
cache and, if required, to the L2 cache or system memory. The 44-entry LSU provides a data interface
for both the integer scheduler and the floating-point scheduler. It consists of two queues: a 12-entry
queue for L1 cache load and store accesses and a 32-entry queue for L2 cache or system memory load
and store accesses. The 12-entry queue can request a maximum of two L1 cache operations (any mix
of loads and stores) per cycle. Up to two 64-bit stores can be performed per cycle. In other words,
at two 64-bit (8-byte) operations per cycle, 16 bytes per clock is the maximum rate at which the
processor can move data. The 32-entry queue holds requests that missed when the 12-entry queue
probed the L1 cache. Finally, the LSU helps ensure that the architectural load and store ordering
rules are preserved (a requirement for AMD64 architecture compatibility).
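The 16-bytes-per-clock figure follows from the two numbers given in the paragraph above. The short Python sketch below works through that arithmetic. The function names and the 2.4 GHz clock frequency are assumptions chosen for illustration only; they do not come from this guide, and the per-second result is a theoretical ceiling, not a measured figure.

```python
# Figures taken from the paragraph above.
OPS_PER_CYCLE = 2        # max L1 cache operations requested per cycle
BYTES_PER_OP = 64 // 8   # each operation moves up to 64 bits = 8 bytes


def peak_bytes_per_cycle(ops_per_cycle=OPS_PER_CYCLE, bytes_per_op=BYTES_PER_OP):
    """Upper bound on bytes moved between the LSU and the L1 cache per clock."""
    return ops_per_cycle * bytes_per_op


def peak_bytes_per_second(clock_hz):
    """Scale the per-cycle bound by a core clock frequency given in Hz."""
    return peak_bytes_per_cycle() * clock_hz


# Assumption: an illustrative clock frequency, not a value from this guide.
ASSUMED_CLOCK_HZ = 2_400_000_000

print(peak_bytes_per_cycle())                          # 16
print(peak_bytes_per_second(ASSUMED_CLOCK_HZ) / 1e9)   # 38.4 (GB/s, theoretical)
```

Real code rarely sustains this bound, because it assumes every cycle issues two full-width operations that all hit in the L1 cache.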
Figure 9. Load-Store Unit
A.16 L2 Cache
The AMD Athlon 64 and AMD Opteron processors each contain an integrated L2 cache. This full-speed
on-die L2 cache features an exclusive cache architecture. The L2 cache contains only victim or
copy-back cache blocks that are to be written back to the memory subsystem as a result of a conflict
miss. The terms victim and copy-back refer to cache blocks that were previously held in the L1
cache but had to be overwritten (evicted) to make room for newer data. The victim buffer contains
data evicted from the L1 cache.
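The exclusive arrangement described above can be illustrated with a toy model in Python. This is a minimal sketch under loud simplifying assumptions, not a description of the actual hardware: both levels are modeled as small, fully associative caches with least-recently-used replacement, the victim buffer is omitted, and the class name ExclusiveCacheModel and its methods are hypothetical. The only behavior it aims to show is that a block enters the L2 only when it is evicted from the L1, and leaves the L2 when it is brought back into the L1.

```python
from collections import OrderedDict


class ExclusiveCacheModel:
    """Toy two-level cache where L2 holds only blocks evicted from L1."""

    def __init__(self, l1_lines, l2_lines):
        # Keys are block addresses; insertion order tracks LRU (oldest first).
        self.l1 = OrderedDict()
        self.l2 = OrderedDict()
        self.l1_lines = l1_lines
        self.l2_lines = l2_lines

    def access(self, block):
        if block in self.l1:
            self.l1.move_to_end(block)      # refresh LRU position
            return "L1 hit"
        if block in self.l2:
            del self.l2[block]              # exclusive: block leaves L2
            result = "L2 hit"
        else:
            result = "miss"                 # filled from memory into L1 only
        self._fill_l1(block)
        return result

    def _fill_l1(self, block):
        if len(self.l1) >= self.l1_lines:
            victim, _ = self.l1.popitem(last=False)   # evict oldest L1 block
            self._fill_l2(victim)
        self.l1[block] = None

    def _fill_l2(self, victim):
        if len(self.l2) >= self.l2_lines:
            self.l2.popitem(last=False)     # oldest victim is dropped from L2
        self.l2[victim] = None


cache = ExclusiveCacheModel(l1_lines=2, l2_lines=4)
print([cache.access(b) for b in [0, 1, 2, 0, 1]])
# ['miss', 'miss', 'miss', 'L2 hit', 'L2 hit']
```

In this model no block is ever present in both levels at once, so the two capacities add rather than overlap, which is the usual motivation for an exclusive design.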
[Figure 9 diagram labels: LSU (44-entry); Data Cache (2-way, 64 Kbytes); Operand Buses; Result Buses from Core; Store Data to BIU]